MAKING USE OF CENSUS DATA
By C. Luther Fry, Institute of Social and Religious Research


The purpose of this paper is to show the large and important bodies 
of tabulated but unpublished social data that are available in the files 
of the Census Bureau at Washington, and to indicate their value to 
social research. 

This analysis has been made possible only because the Census Bureau 
itself, in response to a request from the Committee on the Utilization 
of Social Data of the Social Science Research Council, took the trouble 
to prepare a careful statement of all the data regarding the population 
of the United States that it tabulated in connection with the 1920 
Census. By comparing this statement with a list of published Census 
material given in an index to published sources recently compiled by 
the Institute of Social and Religious Research,¹ it has been possible to
ascertain the exact nature of the data which the Government tabulated 
but did not print. The tabulated but unpublished information on 
several different topics, together with a summary of the materials re- 
lating to them which were actually printed in 1920, is presented in the 
accompanying tables covering the population of the United States 
classified by sex, color or race, nativity and parentage; rural population 
data available by counties; the marital status by classes of the coun- 
try’s population; and facts concerning tenure of homes. 

These tables, therefore, make available for the first time a complete 
statement of all the tabulated materials, whether published or unpub- 
lished, that were compiled by the 1920 Census regarding the topics 
mentioned. These particular subjects were chosen for special study 
not merely because of their intrinsic value, but also because the result- 
ing tables are believed to give a fair idea of the relative amounts of 


1 Partial results of the Institute’s study were published in an article by Mary Johnston on “Indexing
Social Data,” which appeared in the December, 1929, issue of this Journal.
published and unpublished data in the files of the Census Bureau.
The Government tabulates so much more data than it actually prints
because the Census Bureau’s mechanical system of tabulation
necessarily obtains the printed totals from a series of sub-totals.
As a result each Census secures a vast array of sub-totals that
are never published. In most cases these unprinted materials are too
particularized to be of interest to the general public, but for research 
workers in the field of the social sciences they are of the utmost im- 
portance, especially if the figures available for small groups or areas 
are supplemented by actual field investigations. Normally any rea- 
sonable request for access to tabulated but unpublished data can be 
met by arranging with the Director of the Census to pay the clerical 
costs involved in copying off the desired figures. At the present 
moment, however, the Census Bureau has not unreasonably taken the 
position that no request of this kind can be granted until the press of 
work involved in taking the 1930 Census is over. 

As an example of the way in which these unpublished data can be 
used may be cited the case of a southern university which decided 
several years ago to undertake special social studies of the city in which 
it was located. Among other things, it wanted to know the distribution
of Negroes through the community. The published Census volumes
give detailed information only by wards, but these units are
unsatisfactory because of the well-known political practice of gerry- 
mandering. Because of the arbitrary way in which ward boundaries 
are often drawn one cannot be sure how Negroes and whites are dis- 
tributed within each ward. However, by getting in touch directly 
with the Federal Census Bureau, it was possible to find out how many 
Negroes and how many whites lived in each of the sub-divisions of a 
ward that are known as “enumeration districts.” These materials
enabled the university to plot the distribution of Negroes with great 
accuracy. 

From the standpoint of social research, the information that the 
Census Bureau has tabulated by enumeration districts but which it 
has never published, is so important as to be worth describing more 
precisely. One of the first steps in connection with the tabulation of 
the 1920 Census was to classify for each enumeration district the num- 
ber of males and of females grouped by color and race according to the 
following categories: white, black, mulatto, Indian, Chinese, Japanese, 
Filipino, Hindu, Korean, and other. Thus for every one of the 90,000 
enumeration districts used in taking the 1920 Census, separate totals 
are available showing the population classified by color and sex. 
These returns for 1920 can be obtained, however, only by paying the
cost of checking the tabulations, since for the last Census the figures,
not being intended for presentation, were not finally verified. This 
cost, however, is likely to prove insignificant, since the Census Bureau’s 
original figures are, as a rule, so accurate that the checking process 
involves very few changes. According to present plans the com- 
parable tabulations that will be made in connection with the 1930 
Census will not only be carefully checked from the start but will in- 
volve a large number of items to be tabulated by enumeration districts. 

The problem of verification, which arises in connection with the use 
of unpublished Census figures tabulated by enumeration districts, 
does not occur in relation to the data listed in the accompanying tables 
since all of these materials, both the published and the unpublished, 
have already been carefully checked. 

Even a casual glance at these tables brings out two main points. In 
the first place the bodies of data that the Government tabulates but 
does not publish are extensive. This is clearly shown by the large 
number of T’s given in the tables. Each T represents one or more 
items that have been tabulated but not printed, while the P’s indicate 
the figures that have been actually published in the final Census 
volumes. 

Taken as a group, the tables contain more unprinted than printed 
items. Even this statement under-emphasizes the comparative 
amount of unpublished Census data because of the fact that the T’s 
generally occur in connection with figures that are relatively numerous. 
For example, the tables indicate that many totals for individual coun- 
ties have been tabulated but not published. Thus every single T 
occurring under the heading “individual counties” means that infor- 
mation on a particular point is tabulated but not printed for each one 
of the 3,000 counties in the United States. 

This leads to the second main conclusion to be drawn from the tables, 
namely, that the information which is tabulated but not printed gener- 
ally relates to small units such as towns, counties, small population 
groups and the like. These are the very figures that are most useful 
in making local surveys. If a research worker wants to make a sociological
study of an area the totals for the entire district are not likely
to be illuminating, since they often conceal the very differences that 
the investigator most wants to study. 

Table I brings out the vast wealth of materials available for small 
areas. It shows, for instance, that so far as the rural population by 
counties is concerned, the Government publishes only one figure, the 
total population of each; yet the tabulated but unpublished informa- 
tion covers a whole series of additional items. For every one of the 




















132 American Statistical Association [4 


3,000 counties the white elements of the rural population are grouped 
by sex into four categories: (a) native born of native parentage, (b) 
native born of mixed parentage, (c) native born of foreign parentage 
and (d) foreign born; while the colored population is subdivided into 
(a) Negroes, (b) Indians, (c) Chinese, (d) Japanese and (e) All Others. 
Thus for any county in the United States it is possible to find already 
tabulated the number of Negro men and Negro women included in the 
rural population. Similar totals are available for the native and 
foreign-born elements of the population and for the different oriental 
groups. 

This is not all. Although not falling within the scope of Table I, 
nevertheless it is a fact that every one of the rural classes of the popula- 
tion just enumerated has been further classified by counties to show 
(a) their age groupings, (b) the persons attending school by age classes, 
(c) the illiterates by age groups, (d) the number of dwellings and of 
families and (e) the tenure of homes. Thus the complete Census data 
already tabulated for the rural population by counties are as extensive 
as the list shown in Table II. 

Only one of all these rural figures is actually published for every 
county, although some additional rural information naturally is pub- 
lished for those counties whose populations are exclusively rural, since 
in these cases each published county total is also a rural total. However,
even the published county totals are far less detailed than the
unpublished rural figures listed in Table II.

The extensive body of tabulated but unpublished data about the 
rural population by counties should furnish a veritable gold mine of 
information for the use of rural sociologists. For instance, not only 
can the composition and characteristics of the rural population by 
counties be analyzed in detail, but these studies can be extended over 
time. The 1930 Census will tabulate rural information by counties 
on much the same basis as that employed in 1920. Thus comparable 
material will be available to contrast the situation today with that of a 
decade ago. In view of the peculiar importance of the so-called “rural 
problem,” this would seem to be a most valuable research project, 
particularly if the rather impersonal Census figures could be supple- 
mented by first-hand field investigations. 

Turning now to Table III, which presents the amount of published 
and unpublished Census data available on the subject of marital status, 
the facts show that for every geographic area and for every population 
and sex group more detailed tabulated information is available than 
was actually published. No information at all was printed regarding 
the marital status of people living in towns having 2,500 to 10,000
inhabitants, nor in small cities with 10,000 to 25,000, yet group totals by
color and nativity classes are available by states for such incorporated
places. Moreover, for individual cities having 25,000 to 100,000 in- 
habitants the Census has tabulated the marital status of the population 
in the same detail as for the largest cities, yet only a fraction of the 
materials available for these medium-sized cities has ever been pub- 
lished. A careful analysis of all the available Census data—both pub- 
lished and unpublished—dealing with marital status should make an 
unusually interesting study, especially if the figures that will be ob- 
tained in connection with the 1930 Census were to be compared with 
data for earlier decades. It is generally believed that the average age 
of marriage in this country is rising; but if so, how explain the fact that 
the 1920 Census found that 10 per cent more of the young women 
between the ages of fifteen and twenty-five years were married than 
was the case in 1900? Since the Government has available in its 
files a great deal of tabulated information on this subject, it would 
seem to be a relatively simple matter to find out in what areas and 
among what population groups this increase has taken place. 

Coming finally to the question of the tenure of American homes, the 
information given in Table IV shows that only a fraction of the tabu- 
lated materials has been printed. The summary actually contains 
150 items that were tabulated but not published, contrasted with 7 
that were printed. No information on this subject was printed by 
counties except a total for the area and for individual incorporated 
places having 10,000 inhabitants or more, yet a whole series of detailed 
figures were tabulated for these very areas. Such figures could be of 
great practical value as an index to economic conditions. 

The figures for home ownership in the 1930 Census should be of even 
greater utility for the reason that the Government has decided to 
introduce the new feature of securing the value of each home that is 
occupant-owned or the monthly rental of each home or apartment that 
is rented. To be of most value these figures should be analyzed by 
small areas since only in this way will they reveal the economic level 
of different neighborhoods. 

This paper has attempted to give only a few illustrative instances of 
the wealth of material tabulated but unpublished by the Census 
Bureau. Thus there is available in the Government files at Washing- 
ton a veritable storehouse of significant social data about local groups 
and areas. If analyzed, these materials should be of the greatest value 
in helping to deal with a multitude of social problems. This is a task 
that should commend itself to everyone interested in the scientific 
study of human society. 


TABLE II

POPULATION DATA FOR RURAL AREAS OF INDIVIDUAL COUNTIES,¹ 1920

(P indicates data printed in the Census volumes while T represents items that have been tabulated but not published)

                                        Total        Age          School       Illiteracy   Dwellings    Tenure
Class of population               Sex   population                attendance                and families of homes
                                        1            2            3            4            5            6

All classes                       M     T            T            T            T            T            T
                                  F     T            T            T            T            T            T
                                  Both  P
Native white, native parentage    M     T            T            T            T            T            T
                                  F     T            T            T            T            T            T
Native white, foreign parentage   M     T            T            T            T            T            T
                                  F     T            T            T            T            T            T
Native white, mixed parentage     M     T            T            T            T            T            T
                                  F     T            T            T            T            T            T
Foreign-born white                M     T            T            T            T            T            T
                                  F     T            T            T            T            T            T
Negro                             M     T            T            T            T            T            T
                                  F     T            T            T            T            T            T
Indian                            M     T            T            T            T            T            T
                                  F     T            T            T            T            T            T
Chinese                           M     T            T            T            T            T            T
                                  F     T            T            T            T            T            T
Japanese                          M     T            T            T            T            T            T
                                  F     T            T            T            T            T            T
Other colored                     M     T            T            T            T            T            T
                                  F     T            T            T            T            T            T


1 For every county whose population is entirely rural, the county totals published by the Census
(see Table I) are, of course, also rural figures but they are not shown in this table.


KEY TO TABLE II
Detailed classification used in

Column 2: Total, all ages; Under 1 year; 1 to 4 years; 5 years; 6 years; 7 to 9 years; 10 to 13 years; 14 years; 15 years; 16 and 17 years; 18 and 19 years; 20 years; 21 to 44 years; 45 years; 46 years and over; Unknown

Column 3: Total number attending school; 5 years; 6 years; 7 to 9 years; 10 to 13 years; 14 years; 15 years; 16 and 17 years; 18 and 19 years; 20 years

Column 4: Total number illiterate; 10 to 15 years; 16 to 20 years; 21 years and over

Column 5: Number of dwellings; Number of families

Column 6: Total number of; Tenure unknown
TABLE IV

TENURE OF HOMES DISTRIBUTED ACCORDING TO PROPRIETORSHIP AND ENCUMBRANCE,¹ 1920

(P indicates data printed in the Census volumes while T represents items that have been tabulated but not published)

-~—' Individual counties 
Sex of vee , 
xo : = : cities 0 
: United Individual in- 
Class of population head of 50,000 
family | States Urban Urban | “porated places | “and 
Total and Total and |—————————_| over 
rural rural 10,000 | 2,500- 
and over| 10,000 
SR citnesenuianans M T T T T T T T 7 
F T}P T}P T}P T}P T T}P T T}P 
i n+ kscenacemenoemnen M T\ p 
F T 
I dos tcnernie atleast M 7 Pp 
F T A 
-—~ white—Native par- M T T T T T T T 
a ea F > 7 T T T T T 7 
Native white—Foreign par- M = T T T T T T T 
eae F 7 = T T T 4 T T 
Native white—Mixed par- M 4 T T T 4 T T T 
da teea selsueaeda F T T T T T 7 4 - 
F Nm 5 Sa RE: M > T 7 T T 7 T T 
F T T T T T 7 T T 
iit. akvtdibeuweuens M > T T T T T T T 
F T T T 4 7 T T T 
Sn ct aacwhinnbwekdaied M T T T T T T T T 
F 9 T 4 = T T T T 
I cadiinardeeemaeiamal M > T 7  y : 7 T 
F T ? T T T T T T 
Re icdiininabedimndnewinnd M 7 T T T T T T T 
F T T T ey T T T 



































1 For each class the following items are given: number of homes rented, owned (free, mortgaged or unknown as to
mortgaging) and tenure unknown.
2 Data given only for southern states.
THE STANDARD ERROR OF A FORECAST FROM A CURVE¹


By Henry Schultz, University of Chicago


I. INTRODUCTION 


“Everybody makes errors in probabilities at times, and big ones.”
De Morgan. 

The failure of forecasts from a curve to agree with subsequent ob- 
servations may be due either to the choice of the wrong type of curve or 
to our inability to deduce the true values of the parameters of the 
curve. Although it appears impossible to treat by means of functional 
analysis the errors of extrapolation that are due to the choice of the 
wrong type of curve, it is nevertheless important to study the relation 
between the errors of extrapolation and the corresponding errors in the 
parameters, on the assumption that the curve used is the curve that 
ought to be used. 

The object of this paper is to call attention to a formula for the 
standard error of a forecast from a curve derived on this assumption. 
More specifically, it will be shown that such a formula has been at hand 
for the last century in the form of Gauss’ “standard error of a function 
of the unknowns,” but that neither Gauss himself nor the mathema- 
ticians and statisticians who followed him appear to have studied the 
properties of this formula. By means of this formula it is possible to 
express the standard error of a curve in terms of the standard errors of 
its parameters. Since the latter are functions of the independent 
variable (or variables), we may, by assigning to this variable values 
lying within or beyond the range of the observations, obtain the 
standard error of any interpolation or extrapolation. 

The standard errors thus obtained have interesting properties. To 
illustrate these properties a study will be made of the standard errors of 
the following functions: the straight line, the parabola, the cubic, the 


1 Read before the joint meeting of the American Mathematical Society and Section K of the Ameri- 
can Association for the Advancement of Science at Des Moines, December 31, 1929. 

The manuscript was gone over by Professor Edwin B. Wilson, who made acute criticisms and gave 
much useful counsel; and by Professor Sewall Wright, who detected some statements requiring qualifica- 
tion. Professor Raymond Pearl kindly and freely answered questions on the population logistic during 
the progress of the study, and, with Professor Lowell J. Reed, read the manuscript and made helpful com- 
ments. To them as well as to Professors F.C. Mills, H. L. Rietz, F.C. Roos, and to Mr. A. Oppenheim, 
who have also read the manuscript, I owe a debt of gratitude. 

This study was made possible by the generosity of the Local Community Research Committee of the 
University of Chicago, to whom acknowledgment is herewith made. To Miss Ramona Simpson and 
Mr. Roswell H. Whitman, thanks are due for the care and thoroughness with which they carried out the 
mathematical computations. Other helpful assistants have been Miss Elizabeth Millies, Mr. Lester 
S. Kellogg, and Mrs. Ardis T. Monk. 
general equation of the nth degree, the plane, and the Verhulst-Pearl-
Reed population logistic. The first three curves will be fitted to the
same data in order to obtain comparable results. It will be shown that 
the standard error of a forecast from any of these functions increases 
as we lengthen the range of extrapolation—a result which agrees with 
common sense. 

But to know how to compute the standard error of a function, it is 
first necessary to know how to compute the probable values of the 
parameters, their weights, and their standard errors, by the method of 
least squares. The next section is a summary of the formulas, defi- 
nitions, and procedures which are used in these computations. Frequent 
reference will be made to them in the subsequent sections. 


II. THEORETICAL CONSIDERATIONS 
A. STATEMENT OF THE PROBLEM 


Although the method of least squares is most often used to fit curves 
which are linear with respect to their parameters, it can, and for the 
purposes of this paper must, also be used to fit curves which are not 
linear with respect to their parameters. 

A common example of the first class of curves is the second-degree
parabola

$$y = A + Bx + Cx^2. \tag{1}$$


An example of the second class of curves, to which a good deal of
attention will be paid in this paper, is the Verhulst-Pearl-Reed population
curve

$$y = \frac{B}{e^{-Ax} + C}. \tag{2}$$


In treatises on least squares the unknown parameters are almost always
indicated by the letters x, y, z, . . . This is confusing to the
statistician, who is wont to designate by the same letters the several
variables of the function, the values of which are given by the observations,
and which are, therefore, among the knowns, and not among the
unknowns of the problem. To avoid this confusion, it is desirable to
depart somewhat from the Gaussian notation.

Let A, B, C, . . . M (not including L) represent the unknown
parameters, whose probable values are to be determined; let a, b, c,
. . . m (not including l) represent respectively the coefficients of these
parameters, i. e., the independent variables; and let l represent the
dependent variable.

In terms of these symbols, all equations which are linear with respect
to their parameters are of the form

$$l = aA + bB + \cdots + mM. \tag{3}$$

Thus, by comparing (3) with (1) we see that $l=y$, $a=1$, $b=x$, and
$m=x^2$.
More generally, let the function be

$$l = f(A, B, \ldots M) \tag{4}$$

and let us assume that it is not necessarily linear with respect to its
parameters. Our problem is to determine: (1) the most probable
values of the parameters A, B, . . . M; (2) the weight of each parameter;
(3) the standard error of each parameter; (4) the weight of the
function, i. e., the weight of any computed ordinate, l, and (5) the
standard error of the function, i. e., the standard error of any computed
ordinate, l. The last determination gives the standard error of a
forecast which we are seeking.

In the solution of this problem we shall assume that the independent
variables are entirely accurate, i. e., that the failure of any point to
fall on the fitted curve is due to errors in the dependent variable l,
the independent variables a, b, . . . m, being free from errors of
observation.


B. THE FITTING OF A FUNCTION WHICH IS NOT LINEAR 
WITH RESPECT TO ITS PARAMETERS 


To fit (4) by least squares, we must first reduce it to linear form.
Let $A_0, B_0, \ldots M_0$ be close approximations to A, B, . . . M found
by trial or other method. Let $\Delta A, \Delta B, \ldots \Delta M$ be the corrections
required to reduce the approximate values to the most probable values,
so that

$$A = A_0 + \Delta A, \quad B = B_0 + \Delta B, \quad \ldots \quad M = M_0 + \Delta M. \tag{5}$$

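Expanding the function f into a Taylor Series, we have,

$$
\begin{aligned}
f(A_0+\Delta A,\, B_0+\Delta B,\, \ldots\, M_0+\Delta M) ={}& f(A_0, B_0, \ldots M_0) \\
&+ \frac{\partial f}{\partial A_0}\Delta A + \frac{\partial f}{\partial B_0}\Delta B + \cdots + \frac{\partial f}{\partial M_0}\Delta M \\
&+ \frac{1}{2}\frac{\partial^2 f}{\partial A_0^2}(\Delta A)^2 + \frac{1}{2}\frac{\partial^2 f}{\partial B_0^2}(\Delta B)^2 + \cdots + \frac{1}{2}\frac{\partial^2 f}{\partial M_0^2}(\Delta M)^2 \\
&+ \frac{\partial^2 f}{\partial A_0\,\partial B_0}\Delta A\,\Delta B + \cdots + \frac{\partial^2 f}{\partial A_0\,\partial M_0}\Delta A\,\Delta M + \cdots \\
&+ \frac{\partial^2 f}{\partial B_0\,\partial M_0}\Delta B\,\Delta M + \cdots
\end{aligned}
\tag{6}
$$

If $A_0, B_0, \ldots M_0$ are very close to the most probable values, so that
$\Delta A, \Delta B, \ldots \Delta M$ are sufficiently small, the second and higher powers
of the corrections may be neglected, and we have left only the linear
terms.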

The observation equations, which connect the unknown corrections
$\Delta A, \Delta B, \ldots \Delta M$ with the observed values $l_i$ of the function, and the
residual errors, $v_i$, now become

$$f(A_0, B_0, \ldots M_0) + \frac{\partial f}{\partial A_0}\Delta A + \frac{\partial f}{\partial B_0}\Delta B + \cdots + \frac{\partial f}{\partial M_0}\Delta M = l_i + v_i, \tag{7}$$

the partial derivatives varying from observation to observation.

But for any given observed value of the function, the first term of (7)
is a constant, being the corresponding value of the function computed
by substituting the approximate values of the parameters in the function
f. Subtracting this constant from both sides of (7), we reduce our
observation equations to the form

$$\frac{\partial f}{\partial A_0}\Delta A + \frac{\partial f}{\partial B_0}\Delta B + \cdots + \frac{\partial f}{\partial M_0}\Delta M = l'_i + v_i, \tag{8.1}$$

where

$$l'_i = l_i - f(A_0, B_0, \ldots M_0).$$

That is, $l'$ is the difference between any observed value of the function
and the corresponding value computed from the equation when the
parameters are given their approximate values.

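With this change, equations (8.1) may be written

$$
\begin{aligned}
v_1 &= a_1\Delta A + b_1\Delta B + \cdots + m_1\Delta M - l'_1 \\
v_2 &= a_2\Delta A + b_2\Delta B + \cdots + m_2\Delta M - l'_2 \\
&\;\;\vdots \\
v_n &= a_n\Delta A + b_n\Delta B + \cdots + m_n\Delta M - l'_n
\end{aligned}
\tag{8.2}
$$

where

$$a_i = \frac{\partial f}{\partial A_0}, \quad b_i = \frac{\partial f}{\partial B_0}, \quad \text{etc.}$$

This shows clearly that the residual errors $v_1, v_2, \ldots v_n$ are the differences
between the computed $l'$’s and the observed $l'$’s. It will be
noticed that they are also the differences between the observed $l$’s
and the computed $l$’s.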

There is one such equation for each of the n observations. For con- 
venience in exposition we shall assume that all the observation equa- 
tions are equally trustworthy, i. e., that the weight is the same for each 
equation. If the equations are not of the same weight, they can all be
reduced to equations of unit weight, by multiplying each equation by
the square root of its weight.¹
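The reduction to unit weight can be checked numerically. The following is a minimal Python sketch, assuming a straight line l = A + Bx and invented observations and weights (none of these figures come from the paper; the helper name `solve_line` is hypothetical). It scales every term of each observation equation by the square root of its weight, solves the ordinary normal equations, and compares the result with a direct minimisation of the weighted sum of squares.

```python
import math


def solve_line(rows):
    """Solve the two normal equations for l = A*a + B*b.

    rows is a list of (a, b, l) observation equations of unit weight.
    """
    aa = sum(a * a for a, b, l in rows)
    ab = sum(a * b for a, b, l in rows)
    bb = sum(b * b for a, b, l in rows)
    al = sum(a * l for a, b, l in rows)
    bl = sum(b * l for a, b, l in rows)
    det = aa * bb - ab * ab
    return (bb * al - ab * bl) / det, (aa * bl - ab * al) / det


# Illustrative data (assumed, not from the paper): (x, observed l, weight).
observations = [(0.0, 1.0, 1.0), (1.0, 2.9, 4.0), (2.0, 5.2, 1.0), (3.0, 6.8, 0.25)]

# Reduce each equation to unit weight: multiply every term by sqrt(weight).
scaled_rows = [
    (math.sqrt(w) * 1.0, math.sqrt(w) * x, math.sqrt(w) * l)
    for x, l, w in observations
]
A_scaled, B_scaled = solve_line(scaled_rows)

# Reference: minimise sum(w * v**2) directly with weighted sums.
sw = sum(w for x, l, w in observations)
swx = sum(w * x for x, l, w in observations)
swxx = sum(w * x * x for x, l, w in observations)
swl = sum(w * l for x, l, w in observations)
swxl = sum(w * x * l for x, l, w in observations)
det = sw * swxx - swx * swx
A_direct = (swxx * swl - swx * swxl) / det
B_direct = (sw * swxl - swx * swl) / det
```

Both routes lead to the same pair of normal equations, which is why the scaled system may be treated as one of equal weights in what follows.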

It is required to find the values of the unknown corrections
$\Delta A, \Delta B, \ldots \Delta M$ in such a way that these equations may be satisfied as
nearly as possible, i. e., that the residual errors $v_1, v_2, \ldots v_n$ shall be as
small as possible.

By the principle of the method of least squares, of all possible sets of
values of $\Delta A, \Delta B, \ldots \Delta M$, the most satisfactory is that which renders
the sum of the squares of the errors a minimum; that is,

$$v_1^2 + v_2^2 + \cdots + v_n^2 = [vv]$$

is to be a minimum. This condition leads, by the well-known procedure,
to the following m normal equations ($n > m$):


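$$
\begin{aligned}
[aa]\Delta A + [ab]\Delta B + \cdots + [am]\Delta M - [al'] &= 0 \\
[ab]\Delta A + [bb]\Delta B + \cdots + [bm]\Delta M - [bl'] &= 0 \\
\vdots \qquad & \\
[am]\Delta A + [bm]\Delta B + \cdots + [mm]\Delta M - [ml'] &= 0
\end{aligned}
\tag{9}
$$

[ ] being the symbol of summation, from which to determine the m
unknowns $\Delta A, \Delta B, \ldots \Delta M$. It will be observed that the coefficients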
of these equations are symmetrical about the principal diagonal. 
When the function to be fitted is linear with respect to its parameters, 
it is not necessary (though it may still be desirable) to use the foregoing 
method of approximation, as the parameters can then be obtained 
directly from the normal equations. In that event the coefficients of
A, B, . . . M are still equal respectively to the coefficients of
$\Delta A, \Delta B, \ldots \Delta M$ in (9), but the terms in the last column become [al],
[bl], . . . [ml] instead of [al′], [bl′], . . . [ml′].
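For the linear case the normal equations can be formed and solved in a few lines. The sketch below is an illustration in Python, not the author's computing procedure: the data are generated from an assumed parabola y = 1 + 2x + 3x², and the helper names `solve` and `fit_linear` are hypothetical. Each observation contributes its coefficients a = 1, b = x, c = x², and the bracket sums [aa], [ab], . . . are accumulated exactly as in (9), with [al] in place of [al′].

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_linear(coeff_rows, observed):
    """Form the normal equations [aa]A + [ab]B + ... = [al] and solve them."""
    m = len(coeff_rows[0])
    normal = [
        [sum(row[i] * row[j] for row in coeff_rows) for j in range(m)]
        for i in range(m)
    ]
    rhs = [sum(row[i] * l for row, l in zip(coeff_rows, observed)) for i in range(m)]
    return solve(normal, rhs)


# Assumed test data: exact values of y = 1 + 2x + 3x^2 at x = 0, 1, ..., 4.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x + 3.0 * x * x for x in xs]

# Coefficients of the parameters for the parabola (1): a = 1, b = x, c = x^2.
rows = [[1.0, x, x * x] for x in xs]
A, B, C = fit_linear(rows, ys)
```

Because the data lie exactly on the assumed parabola, the fitted A, B, C reproduce the generating constants up to rounding, which makes the sketch easy to verify by hand.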


The solution yielded by (9), to which we shall have to refer later, 
may be conveniently summarized in the determinantal notation. 
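Let D denote the determinant

$$
D = \begin{vmatrix}
[aa] & [ab] & \cdots & [am] \\
[ab] & [bb] & \cdots & [bm] \\
\vdots & \vdots & & \vdots \\
[am] & [bm] & \cdots & [mm]
\end{vmatrix}
\tag{10}
$$

and let $D_{ij}$ denote the cofactor of the element in the i-th row and the
j-th column. Then the values of $\Delta A, \Delta B, \ldots \Delta M$ are known to be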


1 Wright and Hayford, The Adjustment of Observations (1906), Articles 48 and 75. 





































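$$
\begin{aligned}
\Delta A &= \frac{D_{11}}{D}[al'] + \frac{D_{12}}{D}[bl'] + \cdots + \frac{D_{1m}}{D}[ml'] \\
\Delta B &= \frac{D_{21}}{D}[al'] + \frac{D_{22}}{D}[bl'] + \cdots + \frac{D_{2m}}{D}[ml'] \\
&\;\;\vdots \\
\Delta M &= \frac{D_{m1}}{D}[al'] + \frac{D_{m2}}{D}[bl'] + \cdots + \frac{D_{mm}}{D}[ml']
\end{aligned}
\tag{11}
$$

where the cofactor $D_{ij} = D_{ji}$, on account of the symmetry of the determinant
about the principal diagonal.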

We shall refer later to a more practical method of solving for the 
unknowns. The determinantal method has, however, certain un- 
disputed theoretical advantages. 

When the corrections have been determined, the values of the 
parameters may be readily found by (5). 
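The procedure of this section can be sketched in Python as follows. Everything specific here is an assumption made for illustration: the exponential curve l = A·e^{Bx}, the data, and the starting values A₀ = 1.8, B₀ = 0.55 are not taken from the paper. The text applies a single correction to close approximations; repeating the correction until it becomes negligible, as done below, is the common practice usually called Gauss-Newton iteration.

```python
import math


def model(A, B, x):
    """Assumed illustrative curve, not linear in its parameters: l = A * exp(B * x)."""
    return A * math.exp(B * x)


def gauss_newton(xs, ls, A0, B0, iterations=20):
    """Repeat the linearised least-squares correction of equations (5) to (9)."""
    A, B = A0, B0
    for _ in range(iterations):
        # Partial derivatives at the current approximations: a_i and b_i.
        a = [math.exp(B * x) for x in xs]
        b = [A * x * math.exp(B * x) for x in xs]
        # l'_i = observed value minus value computed from the approximations.
        lp = [l - model(A, B, x) for x, l in zip(xs, ls)]
        # Bracket sums of the normal equations (9).
        aa = sum(ai * ai for ai in a)
        ab = sum(ai * bi for ai, bi in zip(a, b))
        bb = sum(bi * bi for bi in b)
        al = sum(ai * li for ai, li in zip(a, lp))
        bl = sum(bi * li for bi, li in zip(b, lp))
        det = aa * bb - ab * ab
        dA = (bb * al - ab * bl) / det
        dB = (aa * bl - ab * al) / det
        # Equation (5): corrected value = approximation + correction.
        A, B = A + dA, B + dB
    return A, B


# Assumed data: exact values of 2 * exp(0.5 * x), so the answer is known.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ls = [model(2.0, 0.5, x) for x in xs]
A_fit, B_fit = gauss_newton(xs, ls, A0=1.8, B0=0.55)
```

With starting values this close to the truth the corrections shrink rapidly, which is the situation the text assumes when it neglects the second and higher powers in (6).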


C. DETERMINATION OF THE WEIGHTS AND STANDARD 
ERRORS OF THE PARAMETERS 





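Since the normal equations are linear in $\Delta A, \Delta B, \ldots \Delta M$, and also
in $l'_1, l'_2, \ldots l'_n$, we may write equations (11)

$$
\begin{aligned}
\Delta A &= \alpha_1 l'_1 + \alpha_2 l'_2 + \cdots + \alpha_n l'_n = [\alpha l'] \\
\Delta B &= \beta_1 l'_1 + \beta_2 l'_2 + \cdots + \beta_n l'_n = [\beta l'] \\
&\;\;\vdots \\
\Delta M &= \mu_1 l'_1 + \mu_2 l'_2 + \cdots + \mu_n l'_n = [\mu l']
\end{aligned}
\tag{12}
$$

in which

$$
\begin{aligned}
\alpha_1 &= \frac{D_{11}}{D}a_1 + \frac{D_{12}}{D}b_1 + \cdots + \frac{D_{1m}}{D}m_1 \\
\alpha_2 &= \frac{D_{11}}{D}a_2 + \frac{D_{12}}{D}b_2 + \cdots + \frac{D_{1m}}{D}m_2 \\
&\;\;\vdots \\
\alpha_n &= \frac{D_{11}}{D}a_n + \frac{D_{12}}{D}b_n + \cdots + \frac{D_{1m}}{D}m_n
\end{aligned}
\tag{13}
$$

and similarly for the $\beta$’s and the $\mu$’s.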
Let $\sigma_{\Delta A}, \sigma_{\Delta B}, \ldots \sigma_{\Delta M}$ denote the standard errors of
$\Delta A, \Delta B, \ldots \Delta M$, respectively; and let $\epsilon$ denote the standard error, or “the
quadratic mean error”¹ of a single observation of unit weight. “Each
observation has its own individual error and when we refer to a ‘single
observation’ in this connection, we mean an observation such as those in the
set which is being discussed, not any single one of them, but a hypothetical
one which is never evaluated, but which is typical of the entire set in so far
as precision is concerned.” In this sense only may it be taken as the
common or mean error of the observed values $l'_1, l'_2, \ldots l'_n$. In
treatises on least squares it is proven² that its most probable value is
given by the equation

$$\epsilon^2 = \frac{[vv]}{n-m} \tag{14}$$

where n is the number of observation equations, and m is the number of
the unknowns $\Delta A, \Delta B, \ldots \Delta M$. If in computing the well-known
“standard error of estimate,” S, where

$$S^2 = \frac{\sum (Y_{\text{observed}} - Y_{\text{computed}})^2}{n} \tag{15}$$

we divide the sum of the squares of the differences not by n, but by n
less the number of “constants” in the regression equation, the result is
equal to $\epsilon^2$, or

$$\epsilon^2 = \frac{n}{n-m} S^2. \tag{16}$$

1 The term “quadratic mean error” is that used by Whittaker and Robinson. Gauss calls $\epsilon$ simply
the “mean error,” a term still employed by German writers on the method of least squares.
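A short numerical check of the relation between $S^2$ and $\epsilon^2$ may be useful. The Python sketch below assumes six invented residuals and m = 2 fitted constants (neither comes from the paper; the function name `error_measures` is hypothetical). It computes both quantities from the same sum of squared residuals [vv].

```python
def error_measures(residuals, m):
    """Return (S2, eps2): [vv]/n as in (15) and [vv]/(n - m) as in (14)."""
    n = len(residuals)
    vv = sum(v * v for v in residuals)
    return vv / n, vv / (n - m)


# Assumed residuals from some fit with m = 2 constants (illustrative only).
residuals = [0.3, -0.1, -0.4, 0.2, 0.1, -0.1]
n, m = len(residuals), 2
S2, eps2 = error_measures(residuals, m)

# Relation (16): eps2 equals n / (n - m) times S2.
ratio = eps2 / S2
```

Here [vv] = 0.32, so the divisor n = 6 gives the smaller figure and the divisor n − m = 4 the larger; the ratio of the two is n/(n − m) = 1.5, as (16) requires.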
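In equations (12) the unknowns $\Delta A, \Delta B, \ldots \Delta M$ have been expressed
as linear functions of the $l'$’s. Since $l'_1, l'_2, \ldots l'_n$ are, by
assumption, independent of each other, and since $\epsilon^2$ is the mean square
error of each $l'$, we have, by the formula for “the mean square error”
(or the square of the standard error) of a linear function of several
independent quantities,³

$$
\begin{aligned}
\sigma_{\Delta A}^2 &= \alpha_1^2\epsilon^2 + \alpha_2^2\epsilon^2 + \cdots + \alpha_n^2\epsilon^2 = \epsilon^2[\alpha\alpha] \\
\sigma_{\Delta B}^2 &= \beta_1^2\epsilon^2 + \beta_2^2\epsilon^2 + \cdots + \beta_n^2\epsilon^2 = \epsilon^2[\beta\beta] \\
&\;\;\vdots \\
\sigma_{\Delta M}^2 &= \mu_1^2\epsilon^2 + \mu_2^2\epsilon^2 + \cdots + \mu_n^2\epsilon^2 = \epsilon^2[\mu\mu]
\end{aligned}
\tag{17}
$$

where $[\alpha\alpha] = \sum\alpha^2$, $[\beta\beta] = \sum\beta^2$, . . . $[\mu\mu] = \sum\mu^2$.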

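The reciprocals of the sums $[\alpha\alpha], [\beta\beta], \ldots [\mu\mu]$ may be regarded
as the weights of $\Delta A, \Delta B, \ldots \Delta M$. Calling these $w_{\Delta A}, w_{\Delta B}, \ldots
w_{\Delta M}$, respectively, we have

$$w_{\Delta A} = \frac{1}{[\alpha\alpha]}, \quad w_{\Delta B} = \frac{1}{[\beta\beta]}, \quad \ldots \quad w_{\Delta M} = \frac{1}{[\mu\mu]}. \tag{18}$$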


1 Ora Miner Leland, Practical Least Squares (1921), p. 161. Italicized by the writer of this paper.
2 Whittaker and Robinson, The Calculus of Observations (1924), Article 124, pp. 243-46.
3 Wright and Hayford, op. cit., Article 53.
Hence, by (17),

$$\sigma_{\Delta A}^2 = \frac{\epsilon^2}{w_{\Delta A}}, \quad \sigma_{\Delta B}^2 = \frac{\epsilon^2}{w_{\Delta B}}, \quad \ldots \quad \sigma_{\Delta M}^2 = \frac{\epsilon^2}{w_{\Delta M}}. \tag{19}$$
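Equations (12) to (19) can be traced through a small example. The Python sketch below assumes a straight line l = A + Bx fitted to five invented observations (the data are not from the paper). Because the line is linear in its parameters, the unknowns are A and B themselves and the observed l takes the place of l′. The multipliers $\alpha_i$ and $\beta_i$ are formed from the cofactors as in (13), the parameters follow from (12), and the standard errors from (14), (17), and (19).

```python
import math

# Assumed observations (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ls = [1.1, 2.9, 5.2, 6.8, 9.1]
n, m = len(xs), 2

# Bracket sums for the coefficients a = 1 and b = x.
aa = float(n)
ab = sum(xs)
bb = sum(x * x for x in xs)

# Determinant (10) and its cofactors for the 2 x 2 case.
D = aa * bb - ab * ab
D11, D12, D22 = bb, -ab, aa

# Multipliers of equation (13): one alpha and one beta per observation.
alpha = [(D11 * 1.0 + D12 * x) / D for x in xs]
beta = [(D12 * 1.0 + D22 * x) / D for x in xs]

# Equation (12): each parameter is a linear function of the observations.
A = sum(al * l for al, l in zip(alpha, ls))
B = sum(be * l for be, l in zip(beta, ls))

# Equation (14): standard error of an observation of unit weight.
residuals = [A + B * x - l for x, l in zip(xs, ls)]
eps2 = sum(v * v for v in residuals) / (n - m)

# Equations (17) to (19): reciprocal weights and standard errors.
sum_alpha2 = sum(al * al for al in alpha)
sum_beta2 = sum(be * be for be in beta)
w_A, w_B = 1.0 / sum_alpha2, 1.0 / sum_beta2
sigma_A = math.sqrt(eps2 / w_A)
sigma_B = math.sqrt(eps2 / w_B)
```

Summing the squared multipliers observation by observation is laborious for larger problems, which is why a shorter route to the same reciprocal weights is worth having.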


To obtain the standard errors of the corrections to the parameters it is,
therefore, necessary to compute the standard error of an observation of
unit weight ($\epsilon$) and the weights of the corrections. Now the value of $\epsilon$
is given by (14).¹ But we have still to learn how to evaluate the
weights of the corrections, or, rather, their reciprocals $[\alpha\alpha], [\beta\beta], \ldots
[\mu\mu]$.

But when the function which we are fitting is not linear with respect
to its parameters, we have assumed that the known approximations to
A, B, . . . M are very close, so that we may consider $A = A_0+\Delta A$,
$B = B_0+\Delta B$, . . . $M = M_0+\Delta M$. Here $A_0, B_0, \ldots M_0$ are absolute
constants, and have, therefore, no errors; while $\Delta A, \Delta B, \ldots \Delta M$ are
very small quantities, liable to error. It follows from this that the
errors of A, B, . . . M are respectively equal to the errors of the corrections
$\Delta A, \Delta B, \ldots \Delta M$, to a first approximation. Hence

$$\sigma_A = \sigma_{\Delta A}, \quad \sigma_B = \sigma_{\Delta B}, \quad \ldots \quad \sigma_M = \sigma_{\Delta M}. \tag{20}$$
But the error of a quantity is related to its weight by (19); and it 


will be shown later that the standard error e of a single observational 
equation of unit weight is approximately the same for the corrections 


AA, AB, ...AM as for the values A, B,...M. We may also 
conclude, therefore, that the weights of AA, AB, . . . AM are, to a 
first approximation, equal to the weights of A, B, ... M, which 


cannot be determined directly because the function is not linear in 
«+. ae 


W4=Wys, Wep=Wap, . . - Wy =Wam. (21) 


It is a consideration in favor of the determinantal method of solving 
the normal equations—see equations (10) and (11)—that, as D, Du, 
Dw, . . . are calculated in order to find AA, AB, AC, . . . , the method 
furnishes the weights without any fresh calculation,’ for it can be proved 
that 


$$\frac{1}{w_A} = [\alpha\alpha] = \frac{D_{11}}{D}, \quad \frac{1}{w_B} = [\beta\beta] = \frac{D_{22}}{D}, \quad \frac{1}{w_C} = [\gamma\gamma] = \frac{D_{33}}{D}, \quad \ldots, \quad \frac{1}{w_M} = [\mu\mu] = \frac{D_{mm}}{D} \tag{22.1}$$

and

$$[\alpha\beta] = \frac{D_{12}}{D}, \quad [\alpha\gamma] = \frac{D_{13}}{D}, \quad \ldots, \quad [\beta\gamma] = \frac{D_{23}}{D}, \quad \ldots \tag{22.2}$$

¹ A more practical method of computing the sum of the squares of the residuals $[vv]$, and, hence, $\epsilon$, is
given in texts on least squares. See, for example, Wright and Hayford, op. cit., §106. The economic
statistician will probably find it more convenient to use the formula in the form given by F. C. Mills,
Statistical Methods Applied to Economics and Business (1924), p. 569, formula (2).

² Whittaker and Robinson, op. cit., p. 241.





These quantities are necessary when the weight of a linear function 
of the parameters is required. 

But the determinantal method does not furnish a complete check of
the arithmetical computations. In practice it is more convenient to
modify Gauss' method of substitution, or Doolittle's modification of
it, so that only one combined solution will be necessary for the
parameters and their weights.¹

When the weights of the parameters and $\epsilon$ have been determined,
their standard errors may be found from (19). There remains the
problem of combining the weights and the standard errors of the
parameters so as to obtain the weight and the standard error of the
function, i.e., of any computed ordinate.
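The way in which one solution can furnish the weights may be illustrated numerically. The sketch below assumes that the ratios $D_{ij}/D$ of (22.1) and (22.2) are the elements of the inverse of the (symmetric) matrix of normal-equation coefficients, as holds for any non-singular symmetric matrix. The helper name `invert` and the two-parameter example are hypothetical, and the elimination shown is a plain Gauss-Jordan reduction, not the tabular form of Gauss or Doolittle.

```python
from fractions import Fraction


def invert(matrix):
    """Invert a small square matrix exactly by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Hypothetical normal-equation matrix of a straight line with centered x:
# [[n, sum x], [sum x, sum x^2]] with n = 19 and sum x^2 = 570.
q = invert([[19, 0], [0, 570]])
reciprocal_weights = [q[i][i] for i in range(2)]   # diagonal: [aa], [bb]
```

The diagonal of the inverse gives the reciprocals of the weights, and the off-diagonal elements give the product sums needed for the weight of a function of the parameters.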


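¹ See Wright and Hayford, op. cit., p. 127. Lack of space prevents the reproduction of the form which
the author has found very convenient in his work.

D. THE STANDARD ERROR OF A FUNCTION¹

Since the errors of $\Delta A, \Delta B, \ldots, \Delta M$ are not independent, but are
connected by the normal equations (9), we cannot apply to the last
equation the reasoning applicable to the standard error of a linear
function of independent quantities. We must first get rid of this
entanglement by expressing these quantities, $\Delta A, \Delta B, \ldots, \Delta M$, in
terms of $l'_1, l'_2, \ldots, l'_n$, which are independent of each other.

From (12) we have

$$\Delta A = [\alpha l'], \quad \Delta B = [\beta l'], \quad \ldots, \quad \Delta M = [\mu l'] \tag{23}$$

where the $\alpha$'s, $\beta$'s, etc. are functions of the a's, b's, etc. of (8.2).

From (6) we obtain, to a first approximation,

$$\begin{aligned}
df &= f(A_0 + \Delta A, B_0 + \Delta B, \ldots, M_0 + \Delta M) - f(A_0, B_0, \ldots, M_0) \\
   &= \frac{\partial f}{\partial A_0}\Delta A + \frac{\partial f}{\partial B_0}\Delta B + \cdots + \frac{\partial f}{\partial M_0}\Delta M.
\end{aligned} \tag{24}$$

Substituting for $\Delta A, \Delta B, \ldots, \Delta M$ from (23), we obtain

$$\begin{aligned}
df &= \frac{\partial f}{\partial A_0}[\alpha l'] + \frac{\partial f}{\partial B_0}[\beta l'] + \cdots + \frac{\partial f}{\partial M_0}[\mu l'] \\
   &= \sum_{i=1}^{n}\left(\alpha_i\frac{\partial f}{\partial A_0} + \beta_i\frac{\partial f}{\partial B_0} + \cdots + \mu_i\frac{\partial f}{\partial M_0}\right) l'_i.
\end{aligned} \tag{25}$$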


Now the standard errors of all the $l$'s are the same, since they are
the standard errors of a hypothetical observational equation of unit
weight. Furthermore, since $l'_i$ is related to $l_i$ by the equation

$$l_i = l'_i + f(A_0, B_0, \ldots, M_0), \tag{26}$$

and since $f(A_0, B_0, \ldots, M_0)$ is for any given observation an absolute
constant and has, therefore, no standard error, the standard error of
$l'_i$ must be equal to the standard error of $l_i$.

Now the standard error of a function is the root-mean-square of all
such deviations as $df$. Squaring both sides of (25), summing for all
samples, taking the arithmetic mean, and extracting the square root,
we have, letting $\sigma_f$ denote the standard error of $f(A, B, \ldots, M)$,


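$$\begin{aligned}
\sigma_f^2 = \epsilon^2\Big\{ &[\alpha\alpha]\Big(\frac{\partial f}{\partial A_0}\Big)^2 + [\beta\beta]\Big(\frac{\partial f}{\partial B_0}\Big)^2 + \cdots + [\mu\mu]\Big(\frac{\partial f}{\partial M_0}\Big)^2 \\
&+ 2[\alpha\beta]\frac{\partial f}{\partial A_0}\frac{\partial f}{\partial B_0} + \cdots + 2[\alpha\mu]\frac{\partial f}{\partial A_0}\frac{\partial f}{\partial M_0} \\
&+ \cdots + 2[\beta\mu]\frac{\partial f}{\partial B_0}\frac{\partial f}{\partial M_0} + \cdots \Big\}.
\end{aligned} \tag{27.1}$$

¹ See David Brunt, The Combination of Observations (1917), §53; Whittaker and Robinson, op. cit.,
§123, pp. 242-43.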


Of the values entering into this formula, $\epsilon$ is given approximately by
(14) or by (16), where $[vv]$ is the sum of the squares of the differences
between the observed values (ordinates) and the corresponding values
which have been computed from the non-linear function,

$$f(A_0 + \Delta A, B_0 + \Delta B, \ldots, M_0 + \Delta M).$$

The sums $[\alpha\alpha], [\beta\beta], \ldots, [\mu\mu]$ may be derived from (22.1), or by
the modified method of substitution referred to on page 147. They
are the reciprocals of the weights of $\Delta A, \Delta B, \ldots, \Delta M$, respectively.
When multiplied by $\epsilon^2$, they give, by virtue of (19) and (20), the squares
of the standard errors of A, B, ... M.

The sums $[\alpha\beta], \ldots, [\alpha\mu], \ldots, [\beta\mu], \ldots$ are functions of the correlations
between the errors of the various parameters, and may be
computed by (22.2), or more readily by the method of substitution
referred to above.

There remain the partial derivatives of the function with respect to
the parameters, and the products of these derivatives. These are, of
course, quite easily derived from the function. Thus all the terms of
(27.1) and, hence, the value of $\sigma_f$, can be determined.
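Formula (27.1) is a quadratic form in the partial derivatives, and may be sketched in a few lines of Python. The function name `forecast_std_error` and the layout of the matrix `q` (diagonal terms $[\alpha\alpha], [\beta\beta], \ldots$; off-diagonal terms $[\alpha\beta], [\alpha\gamma], \ldots$) are assumptions made for illustration. The numerical values are those printed for the parabola in Table II.

```python
import math


def forecast_std_error(eps, q, grad):
    """sigma_f = eps * sqrt(sum over i, j of q[i][j] * grad[i] * grad[j]).

    Summing over every ordered pair (i, j) counts each cross product twice,
    which supplies the factor 2 of the product terms in (27.1).
    """
    m = len(grad)
    quad = sum(q[i][j] * grad[i] * grad[j] for i in range(m) for j in range(m))
    return eps * math.sqrt(quad)


# Parabola of the worked example; values as printed in Table II.
Q_PARABOLA = [[0.11897, 0.0, -0.0022114],
              [0.0, 0.0017544, 0.0],
              [-0.0022114, 0.0, 0.000073714]]
EPS_PARABOLA = 0.071372

# For f = A + Bx + Cx^2 the partial derivatives are (1, x, x^2).
sigma_at_3 = forecast_std_error(EPS_PARABOLA, Q_PARABOLA, [1.0, 3.0, 9.0])
```

The same function serves for any number of parameters, provided the derivatives are evaluated at the abscissa for which the forecast is wanted.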

Making use of the relations given in (22.1) and (22.2), we may write
(27.1) in the determinantal notation as follows:

$$\begin{aligned}
\sigma_f^2 = \epsilon^2\Big\{ &\frac{D_{11}}{D}\Big(\frac{\partial f}{\partial A_0}\Big)^2 + \frac{D_{22}}{D}\Big(\frac{\partial f}{\partial B_0}\Big)^2 + \cdots + \frac{D_{mm}}{D}\Big(\frac{\partial f}{\partial M_0}\Big)^2 \\
&+ 2\frac{D_{12}}{D}\frac{\partial f}{\partial A_0}\frac{\partial f}{\partial B_0} + \cdots + 2\frac{D_{1m}}{D}\frac{\partial f}{\partial A_0}\frac{\partial f}{\partial M_0} \\
&+ \cdots + 2\frac{D_{2m}}{D}\frac{\partial f}{\partial B_0}\frac{\partial f}{\partial M_0} + \cdots \Big\}.
\end{aligned} \tag{27.2}$$

The expression inside the braces is the reciprocal of the weight of
the function f, so that (27.2) may be written as

$$\sigma_f^2 = \frac{\epsilon^2}{w_f}, \tag{27.3}$$

where $w_f$ stands for the weight of the function. By comparing it with
(19), it is thus seen that the standard error of a function and the
standard error of any parameter of the function are given by the same
general formula.

Equation (27.1) = (27.2), which like practically all the other
formulas of the method of least squares is due to Gauss,¹ gives to a first
approximation the square of the standard error of a function of the
parameters A, B, ... M. It is exact only for a function which is
linear with respect to its parameters.

The partial derivatives in this equation are functions of the inde- 
pendent variable (or variables). By giving the independent variable 
in this equation values lying beyond the range of the observations, 
we obtain from it the standard error of the function corresponding to 
these values. This equation gives, therefore, the standard error of 
an extrapolation as well as that of an interpolation. It may be called 
“the standard error of a forecast,” if we remember that it does not 
measure the error due to the choice of the wrong curve, but only that 
arising from the substitution of probable values of the parameters, 
found by the method of least squares, for the true values. 


TABLE I

SERIES USED IN COMPUTING THE STANDARD ERRORS OF THE
STRAIGHT LINE, THE PARABOLA, AND THE CUBIC

U. S. per capita production of tame hay in short tons

                              Computed from
Year    x     Observed*   Straight line   Parabola   Cubic
1896    −9    0.7602      0.8199          0.7657     0.7870
1897    −8    0.8083      0.8148          0.7786     0.7857
1898    −7    0.9006      0.8096          0.7894     0.7865
1899    −6    0.7614      0.8044          0.7981     0.7888
1900    −5    0.6919      0.7993          0.8046     0.7921
1901    −4    0.7106      0.7941          0.8090     0.7958
1902    −3    0.8144      0.7889          0.8112     0.7996
1903    −2    0.8333      0.7838          0.8114     0.8027
1904    −1    0.8295      0.7786          0.8094     0.8048
1905     0    0.8582      0.7734          0.8053     0.8053
1906    +1    0.7657      0.7683          0.7991     0.8037
1907    +2    0.8187      0.7631          0.7907     0.7994
1908    +3    0.8727      0.7579          0.7802     0.7919
1909    +4    0.8129      0.7527          0.7676     0.7808
1910    +5    0.7462      0.7476          0.7529     0.7654
1911    +6    0.5818      0.7424          0.7360     0.7453
1912    +7    0.7587      0.7372          0.7171     0.7200
1913    +8    0.6595      0.7321          0.6959     0.6888
1914    +9    0.7104      0.7269          0.6727     0.6514

* Obtained by dividing production figures as given in the Yearbook of Agriculture, 1927, Table 264,
p. 927, by U. S. population for January 1 following year, as estimated by Bureau of the Census.

¹ Carl Friedrich Gauss, "Theoria combinationis observationum erroribus minimis obnoxiae," Werke,
Band IV, §29, pp. 34-35. A French translation by Joseph Bertrand was published in Paris
(Mallet-Bachelier) in 1855.

The Standard Error of a Forecast from a Curve


III. APPLICATIONS 
A. THE STRAIGHT LINE, THE PARABOLA, AND THE CUBIC 


The data which will be used to illustrate the properties of the 
standard errors of these functions are the per capita production of 
tame hay in the United States, 1896-1914. No importance attaches 
to this series beyond the fact that it has certain properties that are 
useful for the purposes of this example. 

Table I shows the observed yearly production and the corresponding 
values derived from the straight line, the parabola, and the cubic. 

The equations of these curves, which were obtained by the method 
of least squares, are respectively: 


$$Y = 0.773,42 - 0.005,168,0\,x \tag{28}$$

$$Y = 0.805,30 - 0.005,168,0\,x - 0.001,062,7\,x^2 \tag{29}$$

$$Y = 0.805,30 - 0.000,490,97\,x - 0.001,062,7\,x^2 - 0.000,086,933\,x^3 \tag{30}$$

where Y is measured in tons and x in years, the origin of x being at the
middle of the range, or at 1905.


Figure I is a graphic representation of the original series and the 
three fitted curves. 


FIGURE I 


THE STRAIGHT LINE (A), PARABOLA (B), AND CUBIC (C) FITTED TO THE PER CAPITA
PRODUCTION OF TAME HAY IN THE UNITED STATES, 1896-1914

1. The Standard Errors of the Parameters

It is instructive to compute the standard errors of the parameters
of these functions in addition to the standard errors of the functions
themselves. We need first the sum of the squares of the residuals
$[v^2]$ for each function and the weight of each parameter, or its reciprocal.
The $[v^2]$'s may be derived either from the data of Table I, or more
easily from one of the following formulas:¹

$$[v^2] = [l^2] - \frac{[al]^2}{[aa]} - \frac{[bl\cdot 1]^2}{[bb\cdot 1]} - \frac{[cl\cdot 2]^2}{[cc\cdot 2]} - \frac{[dl\cdot 3]^2}{[dd\cdot 3]} - \cdots \tag{31.1}$$

or

$$[v^2] = [l^2] - A[al] - B[bl] - C[cl] - D[dl] - \cdots \tag{31.2}$$

The latter becomes, for the power series under consideration,

$$[v^2] = [y^2] - A[y] - B[xy] - C[x^2y] - D[x^3y] - \cdots \tag{32}$$

For the straight line, there exist only the first three terms; for the
parabola, the first four, etc.

When $[v^2]$ has been obtained, the standard error of an observation of
unit weight, $\epsilon$, may be easily derived from (14).
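As a numerical check, the straight-line case of (32) may be sketched in Python. The observed values are those of Table I; the 1897 figure is read here as 0.8083, which is an assumption (it is the reading that reproduces the printed constants of the line). The form $\epsilon^2 = [v^2]/(n-m)$ is taken for (14), consistent with (15) and (16). The variable names are illustrative.

```python
import math

# Observed per capita production, 1896-1914 (Table I); x is centered at 1905.
# Assumption: the 1897 value is read as 0.8083.
y = [0.7602, 0.8083, 0.9006, 0.7614, 0.6919, 0.7106, 0.8144, 0.8333,
     0.8295, 0.8582, 0.7657, 0.8187, 0.8727, 0.8129, 0.7462, 0.5818,
     0.7587, 0.6595, 0.7104]
x = list(range(-9, 10))
n = len(y)

sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_xx = sum(xi * xi for xi in x)
sum_yy = sum(yi * yi for yi in y)

# With the origin at mid-range the two normal equations decouple.
A = sum_y / n
B = sum_xy / sum_xx

# Equation (32), first three terms (straight line).
v2 = sum_yy - A * sum_y - B * sum_xy
eps = math.sqrt(v2 / (n - 2))   # two constants, A and B
```

Under these assumptions the computed A, B, $[v^2]$, and $\epsilon$ agree closely with the straight-line column of Table II; small differences in the last places are to be expected from the four-place rounding of the data.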

Table II contains a comparison of the $[v^2]$'s and the $\epsilon$'s for the three
curves. The reader will observe that although $[v^2]$ is decreased as we
increase the degrees of freedom (number of parameters) of the function,
$\epsilon$ is not necessarily decreased. Judged by the magnitude of $\epsilon$, the cubic
gives a worse fit than the parabola, although both of these fits are
superior to that of the straight line.

The numerical values of the reciprocals of the weights of the
parameters and their interrelations are computed at the same time as the
parameters themselves, either by (22.1) and (22.2), or more
conveniently, by solving the normal equations by Gauss' method of
substitution, and by modifying the form of solution so that it will also
yield the reciprocals of the weights.²

The algebraic expressions for the weights of the parameters may also
be obtained by the same methods by substituting n for [aa], $\Sigma x$ for
[ab], $\Sigma x^2$ for [bb], etc. in the solutions of that method. While this
procedure leads to unwieldy expressions for the weights when the number of
parameters is greater than three or four, it is instructive to apply it
when the number of parameters is small, as in the case of the straight
line and the parabola.

¹ The symbols of (31) are the conventional symbols of Gauss' method of substitution for solving for the
parameters and their weights. See Wright and Hayford, op. cit., §106.

² Lack of space prevents a description of this method. See, however, Wright and Hayford, op. cit.,
§§98-103, and note 1, p. 147 of this paper.

Remembering that in our example the origin of x is taken at the
middle of the range, so that the sums of all the odd powers of x are each
zero, we have from the normal equations for the straight line, the
determinant

$$D = \begin{vmatrix} n & 0 \\ 0 & \Sigma x^2 \end{vmatrix} = n\Sigma x^2.$$

Hence

$$[\alpha\alpha] = \frac{D_{11}}{D} = \frac{\Sigma x^2}{n\Sigma x^2} = \frac{1}{n}, \qquad
[\beta\beta] = \frac{D_{22}}{D} = \frac{n}{n\Sigma x^2} = \frac{1}{\Sigma x^2}, \qquad
[\alpha\beta] = \frac{D_{12}}{D} = 0.$$
TABLE II

COMPARISON OF THE PARAMETERS OF THREE CURVES FITTED TO THE
SAME DATA WITH THEIR STANDARD ERRORS, AND
THE RECIPROCALS OF THEIR WEIGHTS

                   Straight line      Parabola            Cubic
A                  0.773,42           0.805,30            0.805,30
B                  −0.005,168,0       −0.005,168,0        −0.000,490,97
C                  ...                −0.001,062,7        −0.001,062,7
D                  ...                ...                 −0.000,086,933
[v²]               0.096,809          0.081,503           0.079,190
ε²                 0.005,694,7        0.005,093,9         0.005,279,4
ε                  0.075,463          0.071,372           0.072,659
σ_A = ε√[αα]       0.017,312          0.024,618           0.025,062
σ_B = ε√[ββ]       0.003,160,8        0.002,989,4         0.007,683,7
σ_C = ε√[γγ]       ...                0.000,612,78        0.000,623,83
σ_D = ε√[δδ]       ...                ...                 0.000,131,14
[αα] = 1/w_A       0.052,632          0.118,97            0.118,97
[ββ] = 1/w_B       0.001,754,4        0.001,754,4         0.011,183
[γγ] = 1/w_C       ...                0.000,073,714       0.000,073,714
[δδ] = 1/w_D       ...                ...                 0.000,003,257,6
[αβ]               0.0                0.0                 0.0
[αγ]               ...                −0.002,211,4        −0.002,211,4
[αδ]               ...                ...                 0.0
[βγ]               ...                0.0                 0.0
[βδ]               ...                ...                 −0.000,175,26
[γδ]               ...                ...                 0.0

In our example, n = 19, and $\Sigma x^2 = 570$, so that $[\alpha\alpha] = 0.052,632$, and
$[\beta\beta] = 0.001,754,4$. These are the values of the reciprocals of the
weights of A and B, respectively. The vanishing $[\alpha\beta]$ (which, however,
is not needed at present) is due solely to the particular choice of origin.

For the parabola, we have the determinant

$$D = \begin{vmatrix} n & 0 & \Sigma x^2 \\ 0 & \Sigma x^2 & 0 \\ \Sigma x^2 & 0 & \Sigma x^4 \end{vmatrix} = n\Sigma x^2\Sigma x^4 - (\Sigma x^2)^3.$$

Hence, after some simplifications, we have

$$\begin{aligned}
[\alpha\alpha] &= \frac{D_{11}}{D} = \frac{\Sigma x^4}{n\Sigma x^4 - (\Sigma x^2)^2}, \\
[\beta\beta] &= \frac{D_{22}}{D} = \frac{1}{\Sigma x^2}, \\
[\gamma\gamma] &= \frac{D_{33}}{D} = \frac{n}{n\Sigma x^4 - (\Sigma x^2)^2}, \\
[\alpha\beta] &= \frac{D_{12}}{D} = 0, \\
[\alpha\gamma] &= \frac{D_{13}}{D} = -\frac{\Sigma x^2}{n\Sigma x^4 - (\Sigma x^2)^2}, \\
[\beta\gamma] &= \frac{D_{23}}{D} = 0.
\end{aligned}$$


By the same procedure we may determine the corresponding
expressions for the cubic.

The identical results might, of course, have been obtained by the
method of substitution.

The quantities $[\alpha\alpha]$, $[\beta\beta]$, and $[\gamma\gamma]$ are the reciprocals of the weights of
the parameters A, B, and C, respectively. (The three remaining
quantities will be needed later.) When the weights of the parameters
are known, their standard errors may be found by (19).
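The closed expressions for the straight line and the parabola may be evaluated directly for the nineteen years of the example. The following Python sketch uses exact fractions; the dictionary keys (`aa`, `bb`, `cc`, `ac`) are illustrative names for $[\alpha\alpha]$, $[\beta\beta]$, $[\gamma\gamma]$, and $[\alpha\gamma]$.

```python
from fractions import Fraction

xs = range(-9, 10)                  # years 1896-1914, origin at 1905
n = len(xs)
s2 = sum(x ** 2 for x in xs)        # sum of x^2
s4 = sum(x ** 4 for x in xs)        # sum of x^4
den = n * s4 - s2 ** 2              # n*Sum(x^4) - (Sum(x^2))^2

line = {"aa": Fraction(1, n), "bb": Fraction(1, s2)}
parabola = {
    "aa": Fraction(s4, den),
    "bb": Fraction(1, s2),
    "cc": Fraction(n, den),
    "ac": Fraction(-s2, den),
}
```

Converted to decimals, these fractions reproduce the corresponding entries of Table II to the number of places printed there.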

Table II enables us to compare the parameters of the three curves
with their respective standard errors. The same table also gives the
numerical values of the reciprocals of the weights of the parameters,
and related constants. In ascertaining the significance of any of these
parameters, the methods given in R. A. Fisher's Statistical Methods for
Research Workers are especially appropriate, since the number of
observations on which our results are based is small.

2. The Standard Errors of the Three Functions

There remains the task of determining the standard errors of the
three functions (curves). Since in fitting the three curves under
consideration no use was made of the method of approximation of Section
II-B, we have $A = A_0$, $B = B_0$, etc. The standard error of the straight
line

$$f(x) = Y = A + Bx$$

becomes

$$\sigma_f = \epsilon\left\{[\alpha\alpha]\Big(\frac{\partial f}{\partial A}\Big)^2 + [\beta\beta]\Big(\frac{\partial f}{\partial B}\Big)^2 + 2[\alpha\beta]\frac{\partial f}{\partial A}\frac{\partial f}{\partial B}\right\}^{1/2}.$$

The partial derivatives are

$$\frac{\partial f}{\partial A} = 1, \qquad \frac{\partial f}{\partial B} = x.$$

Also, since the origin is at the middle of the range of x, $[\alpha\beta] = 0$.
Substituting these values in the foregoing equation, we obtain the
hyperbola

$$\sigma_f = \epsilon\left\{[\alpha\alpha] + [\beta\beta]x^2\right\}^{1/2} = \left\{\sigma_A^2 + \sigma_B^2x^2\right\}^{1/2}. \tag{33.1}$$


Substituting for $[\alpha\alpha]$ and $[\beta\beta]$ the numerical values which they have in
the present example (see Table II), we have

$$\begin{aligned}
\sigma_f &= 0.075,463\left\{0.052,632 + 0.001,754,4\,x^2\right\}^{1/2} \\
         &= \left\{0.000,299,72 + 0.000,009,990,6\,x^2\right\}^{1/2}.
\end{aligned} \tag{33.2}$$

This function has a minimum for x = 0. As the origin is taken at the
middle of the range of x, this means that the middle ordinate of a linear
trend has the lowest standard error. The farther an ordinate is from
the midpoint of the range, the larger its standard error.¹ (See Figure
II.)
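The behavior of the hyperbola (33.1) may be tabulated with a short Python sketch. The constants are those printed in Table II for the straight line; the function name `sigma_line` is illustrative.

```python
import math

EPS = 0.075463       # standard error of an observation of unit weight
AA = 0.052632        # [aa] = 1/n
BB = 0.0017544       # [bb] = 1/Sum(x^2)


def sigma_line(x):
    """Standard error of the straight-line ordinate at abscissa x, by (33.1)."""
    return EPS * math.sqrt(AA + BB * x * x)


# Ordinates within the observed range, 1896 (x = -9) to 1914 (x = +9).
within_range = {x: sigma_line(x) for x in range(-9, 10)}
```

The value at x = 0 is the standard error of A itself, and the values rise symmetrically on either side of the mid-year.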

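The general formula for the standard error of a second degree
parabola is, by (27.1),

$$\begin{aligned}
\sigma_f = \epsilon\Big\{ &[\alpha\alpha]\Big(\frac{\partial f}{\partial A}\Big)^2 + [\beta\beta]\Big(\frac{\partial f}{\partial B}\Big)^2 + [\gamma\gamma]\Big(\frac{\partial f}{\partial C}\Big)^2 + 2[\alpha\beta]\frac{\partial f}{\partial A}\frac{\partial f}{\partial B} \\
&+ 2[\alpha\gamma]\frac{\partial f}{\partial A}\frac{\partial f}{\partial C} + 2[\beta\gamma]\frac{\partial f}{\partial B}\frac{\partial f}{\partial C}\Big\}^{1/2}.
\end{aligned}$$

¹ This result has also been obtained by others. See Holbrook Working and Harold Hotelling,
"Applications of the Theory of Error to the Interpretation of Trends," and the present writer's discussion of it,
in this JOURNAL, Vol. XXIV, No. 165A, March, 1929, Supplement, pp. 73-89.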












156 American Statistical Association — [28 


FIGURE II 


THE STANDARD ERRORS OF THE THREE FUNCTIONS OF FIGURE I, FOR VALUES 
OF X FALLING WITHIN THE RANGE OF OBSERVATION 


(The vertical scale of Figure II is five times that of Figure I)

But when the origin is taken at the middle of the range, $[\alpha\beta] = 0$, and
$[\beta\gamma] = 0$. After substituting for the partial derivatives, the equation
becomes

$$\begin{aligned}
\sigma_f &= \epsilon\left\{[\alpha\alpha] + [\beta\beta]x^2 + [\gamma\gamma]x^4 + 2[\alpha\gamma]x^2\right\}^{1/2} \\
         &= \epsilon\left\{[\alpha\alpha] + \big([\beta\beta] + 2[\alpha\gamma]\big)x^2 + [\gamma\gamma]x^4\right\}^{1/2}.
\end{aligned} \tag{34.1}$$

When $\epsilon$, $[\alpha\alpha]$, $[\beta\beta]$, etc. are given the numerical values which they have
in the present example (see Table II), equation (34.1) becomes

$$\sigma_f = 0.071,372\left\{0.118,97 - 0.002,668,4\,x^2 + 0.000,073,714\,x^4\right\}^{1/2}. \tag{34.2}$$

This function has a maximum value at x = 0, and two minimum
values at

$$x = \pm\sqrt{-\frac{[\beta\beta] + 2[\alpha\gamma]}{2[\gamma\gamma]}} = \pm 4.25.$$

It is an increasing function for all values of x lying beyond these two
minimum values. (See Figure II.)
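The position of the two minima of (34.1) follows from setting the derivative of the expression inside the braces equal to zero. A minimal Python sketch, using the constants printed in Table II for the parabola (the names are illustrative):

```python
import math

EPS = 0.071372
AA, BB, CC = 0.11897, 0.0017544, 0.000073714    # [aa], [bb], [cc]
AC = -0.0022114                                 # [ac]


def sigma_parabola(x):
    """Standard error of the parabola ordinate at abscissa x, by (34.1)."""
    return EPS * math.sqrt(AA + (BB + 2 * AC) * x ** 2 + CC * x ** 4)


# d/dx of the braces: 2*(BB + 2*AC)*x + 4*CC*x^3 = 0, so that
# x = 0 or x^2 = -(BB + 2*AC) / (2*CC).
x_min = math.sqrt(-(BB + 2 * AC) / (2 * CC))
```

Because the coefficient of $x^2$ is negative here, the middle ordinate is a local maximum and the smallest standard error lies about four and a quarter years on either side of 1905.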

The formula for the standard error of a cubic is given by (27.1).
When the origin is taken at the middle of the range, so that
$[\alpha\beta] = [\alpha\delta] = [\beta\gamma] = [\gamma\delta] = 0$, and when the partial derivatives
are given their values, the formula becomes

$$\begin{aligned}
\sigma_f &= \epsilon\left\{[\alpha\alpha] + [\beta\beta]x^2 + [\gamma\gamma]x^4 + [\delta\delta]x^6 + 2[\alpha\gamma]x^2 + 2[\beta\delta]x^4\right\}^{1/2} \\
         &= \epsilon\left\{[\alpha\alpha] + \big([\beta\beta] + 2[\alpha\gamma]\big)x^2 + \big([\gamma\gamma] + 2[\beta\delta]\big)x^4 + [\delta\delta]x^6\right\}^{1/2}.
\end{aligned} \tag{35.1}$$

By giving $\epsilon$, $[\alpha\alpha]$, $[\beta\beta]$, etc., their numerical values from Table II, we
obtain

$$\sigma_f = 0.072,659\left\{0.118,97 + 0.006,760,4\,x^2 - 0.000,276,80\,x^4 + 0.000,003,257,6\,x^6\right\}^{1/2}. \tag{35.2}$$

To obtain the necessary condition for a maximum or minimum of
(35.1), we put the first derivative of this function equal to zero. We
have

$$\frac{d\sigma_f}{dx} = \frac{\epsilon^2}{2\sigma_f}\left\{\big([\beta\beta] + 2[\alpha\gamma]\big) + 2\big([\gamma\gamma] + 2[\beta\delta]\big)x^2 + 3[\delta\delta]x^4\right\}2x = 0.$$

It is clear that the function has a maximum or minimum for x = 0.
We must now investigate the expression inside the braces for maxima
and minima. Substituting for $[\beta\beta]$, $[\alpha\gamma]$, etc. their numerical values,
and putting this expression equal to zero, we have

$$0.000,009,772,8\,x^4 - 0.000,553,60\,x^2 + 0.006,760,4 = 0.$$

Considering this as a quadratic equation in $x^2$, and solving it by the
usual method, we get

$$x^2 = 17.813 \text{ or } 38.835.$$

Hence,

$$x = \pm 4.22 \text{ or } \pm 6.23,$$

the origin of x (years) being at 1905. Further investigation shows that
at ±4.22 the standard error has a maximum, and that at ±6.23 it has
a minimum. The standard error increases for values of x beyond
±6.23.
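The roots quoted above may be verified by treating the condition as a quadratic in $x^2$. A minimal Python sketch, using the coefficients as printed in the text (the variable names are illustrative):

```python
import math

# Coefficients of the quadratic in x^2:
#   a*(x^2)^2 + b*(x^2) + c = 0
a = 0.0000097728     # 3[dd]
b = -0.00055360      # 2([cc] + 2[bd])
c = 0.0067604        # [bb] + 2[ac]

root_disc = math.sqrt(b * b - 4 * a * c)
x2_roots = sorted([(-b - root_disc) / (2 * a), (-b + root_disc) / (2 * a)])
x_roots = [math.sqrt(r) for r in x2_roots]   # positive abscissas; the negatives mirror them
```

Both roots in $x^2$ are positive, so each yields a symmetric pair of abscissas; together with x = 0 this accounts for the five stationary values of the cubic's standard error.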

The cubic trend, then, has a minimum standard error for its
mid-ordinate, i.e., when x = 0. As the ordinate is moved to the right (or to
the left) of the middle of the range of x, its standard error at first
increases, attaining a maximum; then decreases, reaching a minimum;
and thereafter increases without limit for all further deviations of the
ordinate from the middle of the range. The cubic, then, has three
minimum and two maximum values for its standard error. (See
Figure II.)


3. Comparison of Results 


Figure II is a comparison of the standard errors of the three functions 
for the range of the observations. The vertical scale of this diagram is 
five times as large as that of Figure I. The straight line, the parabola, 
and the cubic are each conceived of as being stretched out and made to 
coincide with the horizontal line 0 0. The curves above and below 
line 0 0 are the standard errors of these functions. 

It will be observed that the middle ordinate does not have a mini- 
mum standard error for all curves. This should have a bearing on the 
theory of oscillatory interpolation, which consists essentially of welding 
together the middle arcs of a series of high-degree parabolas, on the 
theory that the middle arc is more “‘stable” than any other. 

Figure III shows the properties of these standard errors when the 
curves are extrapolated for considerable distances beyond the range of 
observation. Above and below each curve there is laid off twice its 
standard error. 

A glance at this diagram will show that the standard errors of the
three curves open up as we extrapolate to the right or to the left of the
observed range. Whichever curve is used, a point is reached beyond
which further extrapolation is meaningless. This applies to forecasts
from rational as well as from empirical types of curves. Thus if we
agree that when a forecast ceases to exceed twice its standard error
(three times its probable error) it is not statistically significant, then all
forecasts of the parabola beyond 1923 (x = +18), and all forecasts of the
cubic beyond 1918 (x = +13), cease to have meaning, for the forecasts
(ordinates) for all subsequent years are less than twice their standard
errors. These critical years are given by the intersection of the lower
boundary of the shaded area, $f(x) - 2\sigma_f$, with the x-axis.

Reference to Figure III will also show that the straight line is the
steadiest forecaster. But is it the safest? That depends on whether
we have manifold proofs of the essential soundness of a linear law of hay
production, i.e., on whether the straight line is the right kind of curve
on which to extrapolate. The same answer must be given to the
question whether the parabola or the cubic is the safest curve to extrapolate.
But when we do not know the right curve, it is very dangerous to
extrapolate on the straight line, because the standard error does not
open up very much for a long way. It would not be very dangerous, as
Professor E. B. Wilson points out, to extrapolate on the cubic, because
the standard error opens up enormously right away. If, then, we take
the forecasted value to be that of the cubic, plus or minus its standard
error, the subsequent event will very likely fall within twice the
standard error, even though the cubic is not at all the curve that should be
fitted to the data. But if we give as the extrapolated value the one on
the straight line, plus or minus its standard error, we are very unlikely
to forecast within twice the standard error, unless the line happens to
be the right curve.

FIGURE III

THE STANDARD ERRORS OF EXTRAPOLATIONS OF THE STRAIGHT LINE, PARABOLA,
AND CUBIC FITTED TO THE SAME DATA

Extrapolation of function is shown by dots.
One shaded area represents $f(x) \pm 2\sigma_f$.
The other shaded area represents $f(x) \pm 2S$.

This may be illustrated by the data of our example. The observed
per capita production of hay during the thirteen years from 1915 to
1927 (not shown in the graphs) shows no definite secular increase or
decrease, the observations fluctuating between 0.73 and 0.90 of a ton,
and averaging 0.81 of a ton. If we compare these observations with
the corresponding forecasts from each of the three curves, and if we
adopt for our forecasts the conventional range of $f(x) \pm 2\sigma_f$, which is
shown in Figure III, we find that for the straight line this range includes
only five of the thirteen observations; for the parabola, only three; but
for the cubic, eleven.


4. The General Equation of the nth Degree 


The results which we have obtained for the standard error of an
extrapolation from the straight line, parabola, or cubic, are simply
illustrations of the general property of the standard error of an
extrapolation from the general equation of the nth degree.

Let the fitted curve be

$$f(x) = A_0x^n + A_1x^{n-1} + \cdots + A_n \tag{36}$$

where n is a finite positive integer and the coefficients $A_0, A_1, \ldots, A_n$
do not contain x.

From (27.1) = (27.2) the standard error of this function is of the form

$$\sigma_f = \epsilon\left\{p_0x^{2n} + p_1x^{2n-1} + p_2x^{2n-2} + \cdots + p_{2n}\right\}^{1/2}. \tag{37}$$

The coefficient of $x^{2n}$ is positive, since it is a term of the principal
diagonal of the determinant (10), which is a sum of squares. It can easily
be shown that the first term of (37) can be made to exceed the sum of
all the other terms by giving x a value $x'$ sufficiently great.¹ Thus, the
standard error $\sigma_f$ increases without limit as we extrapolate the function
(36) beyond $\pm x'$.


¹ See Burnside and Panton, Theory of Equations, Vol. I (1904), §4.

B. THE PLANE

The equation of a plane is

$$f(x, y) = z = A + Bx + Cy,$$

the origins of the independent variables (but not of z) being at their
respective means. But when the origins are so taken, $[\alpha\beta] = 0$, $[\alpha\gamma] = 0$.
The standard error of the function then becomes

$$\sigma_f = \epsilon\left\{[\alpha\alpha]\Big(\frac{\partial f}{\partial A}\Big)^2 + [\beta\beta]\Big(\frac{\partial f}{\partial B}\Big)^2 + [\gamma\gamma]\Big(\frac{\partial f}{\partial C}\Big)^2 + 2[\beta\gamma]\frac{\partial f}{\partial B}\frac{\partial f}{\partial C}\right\}^{1/2}. \tag{38.1}$$

Expressing $[\alpha\alpha]$, $[\beta\beta]$, etc. as functions of x and y, and substituting
for the partial derivatives their values

$$\frac{\partial f}{\partial A} = 1, \qquad \frac{\partial f}{\partial B} = x, \qquad \frac{\partial f}{\partial C} = y,$$

we have

$$\sigma_f = \epsilon\left\{\frac{1}{n} + \frac{\Sigma y^2}{K}x^2 + \frac{\Sigma x^2}{K}y^2 - \frac{2\Sigma xy}{K}xy\right\}^{1/2}, \tag{38.2}$$

where $K = \Sigma x^2\Sigma y^2 - (\Sigma xy)^2$.

Of the terms inside the braces, $1/n$ is the reciprocal of the weight of
A, $\Sigma y^2/K$ is the reciprocal of the weight of B, $\Sigma x^2/K$ is the reciprocal of
the weight of C, and $2\Sigma xy/K$ is a function of the correlation between the
error in B and the error in C.

By squaring both sides of (38.2) and rearranging terms, we obtain

$$\frac{\sigma_f^2}{\epsilon^2/n} - \frac{n\Sigma y^2}{K}x^2 - \frac{n\Sigma x^2}{K}y^2 + \frac{2n\Sigma xy}{K}xy = 1.$$

An analysis of this equation shows that it represents an elliptic
hyperboloid of two sheets.

In non-technical terms, this means that the standard error ($\pm\sigma_f$) of
a plane may be represented by two elliptical bowls situated
symmetrically with respect to the fitted plane, with vertices at a distance
of $\pm\epsilon/\sqrt{n}$ from the center of gravity of the plane. The sides of the
two bowls increase without limit as the fitted plane is extrapolated
beyond the range of the data.


If we cut the error surface by a series of planes parallel to the
horizontal, or xy-plane of reference, the curves in which the surface
intersects these planes are a family of concentric and similar ellipses. If
we cut the same surface by a series of parallel planes perpendicular to
the horizontal xy-plane, but not necessarily parallel to one of the other
planes of reference, the curves of intersection (traces) of the surface
with the planes are a family of hyperbolas. But the trace of the
intersection of the fitted plane with any plane perpendicular to the horizontal
xy-plane of reference is a straight line. Hence the standard error of
the plane gives, as a special case, the standard error of the straight
line. See equation (33.1).

The important fact which emerges is that the standard error of a 
plane, which is a minimum at the center of gravity of the plane, in- 
creases without limit when the plane is extrapolated in any direction. 

It can be shown that the standard error of a plane in any number of 
dimensions also has this property. 
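The properties of the error surface of the plane may be examined numerically. The sketch below assumes that the standard error has the form $\sigma_f = \epsilon\{1/n + (\Sigma y^2\,x^2 + \Sigma x^2\,y^2 - 2\Sigma xy\,xy)/K\}^{1/2}$ with $K = \Sigma x^2\Sigma y^2 - (\Sigma xy)^2$, which is what inversion of the normal equations gives when x and y are measured from their means. The function name and the sums used in the example are hypothetical.

```python
import math


def sigma_plane(eps, n, sxx, syy, sxy, x, y):
    """Standard error of the fitted plane at (x, y); x, y measured from their means."""
    k = sxx * syy - sxy ** 2
    quad = (syy * x * x + sxx * y * y - 2 * sxy * x * y) / k
    return eps * math.sqrt(1.0 / n + quad)


# Hypothetical sums: n = 4, Sum(x^2) = Sum(y^2) = 10, Sum(xy) = 0, eps = 1.
at_center = sigma_plane(1.0, 4, 10.0, 10.0, 0.0, 0.0, 0.0)
along_x = sigma_plane(1.0, 4, 10.0, 10.0, 0.0, 2.0, 0.0)
```

At the center of gravity the value reduces to $\epsilon/\sqrt{n}$, and along the line y = 0 (with $\Sigma xy = 0$) it reduces to the hyperbola of the straight line, $\epsilon\{1/n + x^2/\Sigma x^2\}^{1/2}$.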


C. THE POPULATION LOGISTIC 


Because of the increasing general interest in this curve and its ap- 
plication in various fields, the deduction of the correct formula for its 
standard error is of practical, as well as theoretical, importance. 

The equation of this curve is

$$y = \frac{B}{e^{-Ax} + C}. \tag{39}$$

This is a transcendental function which cannot be fitted directly by
the method of least squares. But if we are to give up the method of
least squares as a method of fitting this curve, we must also give up
the formula for the standard error of a non-linear function, for it is
based on the method of least squares. We must, therefore, find a means
of reducing this equation to a linear form, for then the method of least
squares is applicable.


1. The Pearl-Reed Method of Fitting 


The method of fitting used by Pearl and Reed,¹ which, however, was
advanced only as a curve-fitting device and was not meant to be used
in deducing weights and standard errors, appears to effect this
reduction in a simpler manner than that explained in Section II of this
paper. It is, therefore, desirable to examine their procedure from the
point of view of our present needs.

The first step in their procedure is to obtain approximate values of
the three parameters, by passing the curve through three points
uniformly spaced along the x-axis.²

¹ Raymond Pearl, Studies in Human Biology (1924), pp. 578-79.
² Ibid., p. 578.

If a better fit is desired than that given by these selected points,
they use the following method,¹ which will be referred to henceforth as
the Pearl-Reed method:

Let $A_0$ be an approximate value of A, and let $\Delta A$ be the correction
to $A_0$, so that $A = A_0 + \Delta A$.

Then equation (39) may be written

$$y = \frac{B}{e^{-(A_0 + \Delta A)x} + C} \tag{40.1}$$

where B, C, and $\Delta A$ are the constants to be determined.
Rewriting this equation and expanding, they obtain

$$y = \frac{B}{e^{-A_0x}\left\{1 - (\Delta A)x + \dfrac{(\Delta A)^2x^2}{2!} - \cdots\right\} + C} \tag{40.2}$$

or approximately, since $\Delta A$ is small,

$$y = \frac{B}{e^{-A_0x}\{1 - (\Delta A)x\} + C}. \tag{40.3}$$

Then, by clearing fractions and rearranging, they finally obtain

$$ye^{-A_0x} - (\Delta A)xye^{-A_0x} + Cy - B = 0. \tag{40.4}$$

This is the form of their observation equations.

"If now," argue Pearl and Reed, "we proceed by the method of
least squares and let r, the residual to be minimized, be the amount
by which an observed point (x, y) fails to satisfy this equation, we
have for the sum of the squares of the residuals

$$\Sigma r^2 = \Sigma\left(ye^{-A_0x} - (\Delta A)xye^{-A_0x} + Cy - B\right)^2.\text{"} \tag{41}$$

Taking partial derivatives with respect to B, C, and $\Delta A$, and
equating to zero, they have the following equations for the determination
of the three unknowns B, C, and $\Delta A$:

$$\begin{aligned}
Bn - C\Sigma y + (\Delta A)\Sigma xye^{-A_0x} &= \Sigma ye^{-A_0x} \\
B\Sigma y - C\Sigma y^2 + (\Delta A)\Sigma xy^2e^{-A_0x} &= \Sigma y^2e^{-A_0x} \\
B\Sigma xye^{-A_0x} - C\Sigma xy^2e^{-A_0x} + (\Delta A)\Sigma x^2y^2e^{-2A_0x} &= \Sigma xy^2e^{-2A_0x}
\end{aligned} \tag{42}$$

Regarding this procedure, the following observations are in order.


(1) It appears simpler than the standard procedure for reducing 
such a non-linear function to a linear form which is explained in 
Section II, in that it calls for a determination of a correction to only 
one parameter, the other two, B and C, being determined directly by 
equations (42). It is not, however, the true least square procedure, 
for the reason that it does not minimize the sum of the squares of the 
deviations between the computed and the observed ordinates. The 
true least square procedure requires that, having obtained (40.3), we 
minimize 

\[ \sum \left( y - \frac{B}{e^{-A_0 x} - (\Delta A)\,x e^{-A_0 x} + C} \right)^2 . \]

But this, to say the least, is not a convenient procedure. By multiplying 
through by the denominator of this expression, and by using 
(40.4) for their type of observation equation, our authors have introduced 
a system of weighting which is different from that prescribed 
by the standard procedure. In fact, it is difficult to give meaning to 
the “residuals” which they are minimizing by (41). 

1 I am taking the liberty to substitute the symbols of this paper for those used by the authors. 

(2) The fact that the Pearl-Reed procedure is not a true least 
square procedure does not necessarily mean that it will not give a 
good fit. It may even give a better fit than can be obtained from a 
single application of the true least square procedure.¹ It cannot be 
used, however, to determine the weights of the parameters. We are 
restricted, therefore, to the standard least square procedure for fitting 
a non-linear function which gives corrections to all of the parameters, 
and which is explained in Section II. 
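
The Pearl-Reed observation equations (40.4) are linear in B, C, and ΔA once a trial value A_0 is fixed, so the normal equations (42) amount to an ordinary linear least-squares problem. The sketch below illustrates this with NumPy and the census figures of Table III. The helper name `pearl_reed_step` is hypothetical, and the trial value A_0 = 0.0313962 is chosen only for illustration; it is not claimed to be the starting value Pearl and Reed actually used.

```python
import numpy as np

# Census populations in millions, 1790-1910 (Table III, column 2).
y = np.array([3.929, 5.308, 7.240, 9.638, 12.866, 17.069, 23.192,
              31.443, 38.558, 50.156, 62.948, 75.995, 91.972])
x = np.arange(10.0, 140.0, 10.0)  # years beyond 1780


def pearl_reed_step(x, y, a0):
    """Solve the normal equations (42) for B, C and dA (hypothetical helper).

    Observation equation (40.4), with e = exp(-a0 * x):
        y*e - dA*x*y*e + C*y - B = 0
    rearranged as
        B - C*y + dA*(x*y*e) = y*e.
    """
    e = np.exp(-a0 * x)
    design = np.column_stack([np.ones_like(x), -y, x * y * e])
    rhs = y * e
    # Normal equations: (design^T design) theta = design^T rhs.
    normal = design.T @ design
    theta = np.linalg.solve(normal, design.T @ rhs)
    return theta, design, rhs


(B, C, dA), design, rhs = pearl_reed_step(x, y, a0=0.0313962)
print(f"B = {B:.5f}, C = {C:.7f}, dA = {dA:.7f}")
```

The quantity minimized here is the sum of squares of the left-hand side of (40.4), not the sum of squared differences between observed and computed ordinates, which is the distinction drawn in observation (1).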





2. The Least Square Method of Fitting 
Let the logistic be indicated by 

\[ f(A, B, C) = y = \frac{B}{e^{-Ax} + C}. \tag{43} \]


Let A_0, B_0, C_0 be close approximations to the parameters, and let 
ΔA, ΔB, ΔC be the unknown corrections which are required to reduce 
the approximate values to the most probable values, so that 

\[ A = A_0 + \Delta A, \quad B = B_0 + \Delta B, \quad C = C_0 + \Delta C, \]

and 

\[ f = \frac{B}{e^{-Ax} + C} = \frac{B_0 + \Delta B}{e^{-(A_0 + \Delta A)x} + C_0 + \Delta C}. \]

1 See pp. 178 et seq. 

Expanding the function into a Taylor Series, as shown by (6), we 
have, to a first approximation, 

\[
\frac{B_0 + \Delta B}{e^{-(A_0 + \Delta A)x} + C_0 + \Delta C}
= \frac{B_0}{e^{-A_0 x} + C_0}
+ \frac{B_0\,x e^{-A_0 x}}{(e^{-A_0 x} + C_0)^2}\,\Delta A
+ \frac{1}{e^{-A_0 x} + C_0}\,\Delta B
- \frac{B_0}{(e^{-A_0 x} + C_0)^2}\,\Delta C,
\tag{44}
\]

where the coefficients of ΔA, ΔB, and ΔC are respectively the partial 
derivatives of the function with respect to A_0, B_0, and C_0. 

The observation equations which connect the unknown corrections 
ΔA, ΔB, ΔC with the observed values y_i = l_i (i = 1, 2, . . . n) of the 
function and with the residual errors, v_i, become 

\[
\frac{B_0}{e^{-A_0 x_i} + C_0}
+ \frac{B_0\,x_i e^{-A_0 x_i}}{(e^{-A_0 x_i} + C_0)^2}\,\Delta A
+ \frac{1}{e^{-A_0 x_i} + C_0}\,\Delta B
- \frac{B_0}{(e^{-A_0 x_i} + C_0)^2}\,\Delta C
= l_i + v_i.
\tag{45}
\]


But for any given value of x, the first term of this equation is a 
constant, being the ordinate computed from the logistic when the 
parameters are given their approximate values. It may, therefore, 
be subtracted from both sides of (45), giving 

\[
\frac{B_0\,x_i e^{-A_0 x_i}}{(e^{-A_0 x_i} + C_0)^2}\,\Delta A
+ \frac{1}{e^{-A_0 x_i} + C_0}\,\Delta B
- \frac{B_0}{(e^{-A_0 x_i} + C_0)^2}\,\Delta C
= l'_i + v_i,
\tag{46}
\]

where 

\[ l'_i = l_i - \frac{B_0}{e^{-A_0 x_i} + C_0} \]

is the difference between any observed ordinate 
and the corresponding ordinate computed from the equation when 
the parameters are given their approximate values. 

The numerical values of the coefficients of the unknowns ΔA, ΔB, 
and ΔC may be computed for each value of x. The observation 
equations (46) are now linear with respect to their unknowns. The 
latter may, therefore, be determined by the condition that 

\[ \sum_{i=1}^{n} v_i^2 = \sum_{i=1}^{n} \left( a_i\,\Delta A + b_i\,\Delta B + c_i\,\Delta C - l'_i \right)^2 = \text{minimum}, \tag{47} \]

where a_i, b_i, and c_i are the coefficients of the unknowns. 

Differentiating partially with respect to ΔA, ΔB, and ΔC, and setting 
each derivative equal to zero, we have the three normal equations 

\[
\begin{aligned}
[aa]\,\Delta A + [ab]\,\Delta B + [ac]\,\Delta C - [al'] &= 0 \\
[ab]\,\Delta A + [bb]\,\Delta B + [bc]\,\Delta C - [bl'] &= 0 \\
[ac]\,\Delta A + [bc]\,\Delta B + [cc]\,\Delta C - [cl'] &= 0
\end{aligned}
\tag{48}
\]

from which to determine the three corrections. 

If the solution adopted is the modification of the method of substitution 
referred to in Section II, it will determine not only ΔA, ΔB, and 
ΔC, but also their weights and, hence, their standard errors. And it 
was shown on page 146 that the standard errors of the corrections to 
the parameters are respectively equal to the standard errors of the 
parameters themselves, to a first approximation. 
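
The procedure of this section can be sketched numerically as a single correction step: form the coefficients a_i, b_i, c_i of (46), build the normal equations (48), and solve for ΔA, ΔB, ΔC. The sketch below assumes NumPy, and it obtains the weights by inverting the normal matrix directly, not by the modified method of substitution referred to in the text. The function names are illustrative, and the data and starting values are those quoted for the United States population in Section 3.

```python
import numpy as np

x = np.arange(10.0, 140.0, 10.0)  # years beyond 1780
obs = np.array([3.929, 5.308, 7.240, 9.638, 12.866, 17.069, 23.192,
                31.443, 38.558, 50.156, 62.948, 75.995, 91.972])


def logistic(x, A, B, C):
    """Equation (43): y = B / (exp(-A x) + C)."""
    return B / (np.exp(-A * x) + C)


def correction_step(x, obs, A0, B0, C0):
    """One least-squares correction from the observation equations (46)."""
    e = np.exp(-A0 * x)
    d = e + C0
    a = B0 * x * e / d**2   # coefficient of dA
    b = 1.0 / d             # coefficient of dB
    c = -B0 / d**2          # coefficient of dC
    J = np.column_stack([a, b, c])
    l_prime = obs - logistic(x, A0, B0, C0)
    N = J.T @ J                                 # [aa], [ab], ... of (48)
    delta = np.linalg.solve(N, J.T @ l_prime)   # dA, dB, dC
    v = J @ delta - l_prime                     # residuals of linearized model
    eps = np.sqrt(v @ v / (len(x) - 3))         # quadratic mean error
    recip_weights = np.diag(np.linalg.inv(N))   # reciprocals of the weights
    return delta, eps * np.sqrt(recip_weights), v, l_prime


delta, std_err, v, l_prime = correction_step(
    x, obs, A0=0.0313962, B0=2.90688, C0=0.0147862)
print("corrections (dA, dB, dC):", delta)
print("standard errors         :", std_err)
```

Because the normal matrix is built from columns of very different magnitude, a production computation would probably prefer an orthogonal factorization; the direct solve is kept here only because it mirrors equations (48).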


3. The Parameters of the Logistic and their Standard Errors 


By the Pearl-Reed method, the curve for the population of the United 
States, calculated from the census data for 1790 to 1910 inclusive, is 

\[ y = \frac{2.906879}{e^{-0.0313962x} + 0.0147862}, \tag{49} \]

in which y represents the population in millions of persons and x 
represents the number of years beyond 1780.¹ 

Considering the parameters of this equation as close approximations 
to the most probable parameters, so that 

A_0 = 0.0313962, B_0 = 2.90688, C_0 = 0.0147862, 

TABLE III 

COMPARISON OF RESULTS OBTAINED BY FITTING THE LOGISTIC TO THE 
POPULATION OF THE UNITED STATES BY THE PEARL-REED 
METHOD AND BY THE METHOD OF LEAST SQUARES 

                   Population in millions
                        Calculated by
Year    Observed   Pearl-Reed   Least square   (2) − (3)   (2) − (4)
                     method        method
(1)       (2)         (3)           (4)           (5)         (6)

1790     3.929       3.900         3.918        +0.029      +0.011
1800     5.308       5.300         5.321        +0.008      −0.013
1810     7.240       7.183         7.209        +0.057      +0.031
1820     9.638       9.702         9.732        −0.064      −0.094
1830    12.866      13.043        13.076        −0.177      −0.210
1840    17.069      17.427        17.463        −0.358      −0.394
1850    23.192      23.100        23.135        +0.092      +0.057
1860    31.443      30.307        30.337        +1.136      +1.106
1870    38.558      39.252        39.273        −0.694      −0.715
1880    50.156      50.045        50.047        +0.111      +0.109
1890    62.948      62.624        62.598        +0.324      +0.350
1900    75.995      76.709        76.647        −0.714      −0.652
1910    91.972      91.792        91.685        +0.180      +0.287

Σ(v)                                            −0.070      −0.127
Σ(v²)                                           2.60783     2.58899











1 Op. cit., pp. 579-80. 














The Standard Error of a Forecast from a Curve 


we first compute the values of l', which are given in Table III, column 
(5), and then form the observation equations (47). Solving these by the 
method of least squares, we obtain 

ΔA = −0.000,044,4, ΔB = +0.014,78, ΔC = +0.000,100,3, 

so that the most probable values become 

A = A_0 + ΔA = 0.031,351,8 
B = B_0 + ΔB = 2.921,661 
C = C_0 + ΔC = 0.014,886,5 

and the equation becomes 

\[ y = \frac{2.92166}{e^{-0.0313518x} + 0.0148865}. \tag{50} \]

TABLE IV 

THE ERROR INTRODUCED BY NEGLECTING THE SECOND DEGREE TERMS 
OF THE CORRECTIONS IN FITTING THE LOGISTIC BY THE 
METHOD OF LEAST SQUARES 

                                        Values of retained and neglected terms when
Line  Terms                             x = 10              x = 70            x = 130

 1    f(A_0, B_0, C_0) *                 3.900              23.100            91.792
 2    (∂f/∂A_0) ΔA                      −0.001,697,8        −0.063,378        −0.282,53
 3    (∂f/∂B_0) ΔB                       0.019,833           0.117,47          0.466,79
 4    (∂f/∂C_0) ΔC                      −0.000,524,69       −0.018,406        −0.290,65
 5    Sum of terms retained
      [(2) to (4)]                       0.017,611           0.035,684        −0.106,39
 6    (∂²f/∂A_0²) (ΔA)²                  0.000,000,724,15    0.000,150,74      0.000,107,96
 7    (∂²f/∂B_0²) (ΔB)²                  0.0                 0.0               0.0
 8    (∂²f/∂C_0²) (ΔC)²                  0.000,000,141,18    0.000,029,333     0.001,840,6
 9    (∂²f/∂A_0∂B_0) (ΔA)(ΔB)           −0.000,008,633,9    −0.000,322,29     −0.001,436,8
10    (∂²f/∂A_0∂C_0) (ΔA)(ΔC)            0.000,000,456,82    0.000,101,00      0.001,789,2
11    (∂²f/∂B_0∂C_0) (ΔB)(ΔC)           −0.000,002,668,2    −0.000,093,601    −0.001,478,0
12    Sum of terms neglected
      ½ [(6) to (11)]                   −0.000,004,990,0    −0.000,067,409     0.000,411,48

* This represents the values obtained from the Pearl-Reed equation (49). 





Table III gives a comparison of the results obtained from the two 
procedures. 

Though the fit of (49) is very good, it is not so good as that of (50), 
the “true” least square fit, which, of course, is to be expected. The 
latter leads to a somewhat better distribution of the points about the 
curve, and to a lower sum for the squares of the residuals. The quadratic 
mean error, 

\[ \epsilon = \sqrt{\frac{\sum v^2}{n - 3}}, \]

is 0.5107 by the Pearl-Reed method, and 
only 0.5088 by the method of least squares. 
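
As a check on the arithmetic, the quadratic mean error can be recomputed from the residual columns of Table III. The sketch below assumes the divisor n − 3 (thirteen observations less three fitted parameters), which is what the printed sums of squares imply; the function name is illustrative.

```python
import math

# Residuals (observed minus calculated), Table III columns (5) and (6).
v_pearl_reed = [0.029, 0.008, 0.057, -0.064, -0.177, -0.358, 0.092,
                1.136, -0.694, 0.111, 0.324, -0.714, 0.180]
v_least_sq = [0.011, -0.013, 0.031, -0.094, -0.210, -0.394, 0.057,
              1.106, -0.715, 0.109, 0.350, -0.652, 0.287]


def quadratic_mean_error(residuals, n_params=3):
    """epsilon = sqrt(sum(v^2) / (n - number of fitted parameters))."""
    n = len(residuals)
    return math.sqrt(sum(v * v for v in residuals) / (n - n_params))


eps_pr = quadratic_mean_error(v_pearl_reed)
eps_ls = quadratic_mean_error(v_least_sq)
print(f"Pearl-Reed: {eps_pr:.4f}, least squares: {eps_ls:.4f}")
```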

That the fit thus obtained by the method of least squares could 
hardly be improved upon by treating the parameters of (50) as first 
approximations and solving for second corrections, may be shown by a 
comparison of the values of the (linear) terms of (44) which have been 
kept, with the terms involving the squares and products of the corrections 
which have been dropped. (See (6).) If the fit is good, the 
quadratic terms are small as compared with the linear terms. Table 
IV gives such a comparison for the calculated populations for 1790, 
1850, and 1910 (x = 10, 70, and 130). 

It is clear from this table that the neglected terms are negligible as 
compared with the (linear) terms which have been retained. Compare 
lines 5 and 12. This should increase our confidence in the standard 
errors deduced from, or applying to, (50). 
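
The comparison of Table IV can be approximated without writing out the second derivatives: the exact change in the ordinate produced by the corrections, less the linear terms of (44), is the total of everything the linearization drops. The sketch below takes this shortcut, so its remainder includes third and higher order terms as well as the quadratic ones tabulated by the author; the function names are illustrative.

```python
import math


def logistic(x, A, B, C):
    return B / (math.exp(-A * x) + C)


A0, B0, C0 = 0.0313962, 2.90688, 0.0147862    # equation (49)
dA, dB, dC = -0.0000444, 0.01478, 0.0001003   # least-square corrections


def linear_and_remainder(x):
    """Return (sum of linear terms of (44), everything neglected)."""
    y = logistic(x, A0, B0, C0)
    e = math.exp(-A0 * x)
    # Partial derivatives at (A0, B0, C0), written in terms of y.
    linear = (x * e * y**2 / B0) * dA + (y / B0) * dB - (y**2 / B0) * dC
    exact_change = logistic(x, A0 + dA, B0 + dB, C0 + dC) - y
    return linear, exact_change - linear


results = {x: linear_and_remainder(x) for x in (10, 70, 130)}
for x, (lin, rem) in results.items():
    print(f"x = {x:3d}: linear = {lin:+.6f}, neglected = {rem:+.8f}")
```

If the remainder stays a small fraction of the linear sum at every x, as it does here, a second round of corrections would change the parameters very little.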

The standard errors of the parameters are 

σ_A = ±0.000,626,5, σ_B = ±0.125,072, σ_C = ±0.000,448,8. 
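
These figures agree with the usual relation between a standard error and the reciprocal of a weight: each is the quadratic mean error multiplied by the square root of the corresponding diagonal coefficient [αα], [ββ], or [γγ]. The sketch below uses the values of ε and of the three coefficients that the paper reports in Section 4; the dictionary keys are illustrative labels only.

```python
import math

eps = 0.508821  # quadratic mean error of the least square fit (50)

# Diagonal coefficients reported from the solution of the normal equations (48).
recip_weights = {"A": 0.0000015160, "B": 0.0604211, "C": 0.00000077816}

std_errors = {name: eps * math.sqrt(w) for name, w in recip_weights.items()}
for name, s in std_errors.items():
    print(f"sigma_{name} = {s:.7f}")
```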


4. The Standard Error of the Logistic 
The standard error of the function is by (27.1)¹ 

\[
\sigma_f = \epsilon \left\{
[\alpha\alpha]\left(\frac{\partial y}{\partial A_0}\right)^2
+ [\beta\beta]\left(\frac{\partial y}{\partial B_0}\right)^2
+ [\gamma\gamma]\left(\frac{\partial y}{\partial C_0}\right)^2
+ 2[\alpha\beta]\frac{\partial y}{\partial A_0}\frac{\partial y}{\partial B_0}
+ 2[\alpha\gamma]\frac{\partial y}{\partial A_0}\frac{\partial y}{\partial C_0}
+ 2[\beta\gamma]\frac{\partial y}{\partial B_0}\frac{\partial y}{\partial C_0}
\right\}^{1/2},
\tag{51.1}
\]

where y is the function f(A_0, B_0, C_0). 

1 It should be observed that the partial derivative with respect to A is associated with [αα], that the 
partial derivative with respect to B is associated with [ββ], etc., only if the normal equations (48) are so 
arranged that ΔC is determined first, ΔB second, and ΔA third and last. If, for example, ΔA were determined 
first, it would have to be associated with [γγ]. The important fact to be kept in mind is that 
[γγ] relates to the first parameter to be obtained from the normal equations, that [ββ] relates to the second, 
and that [αα] relates to the third and last, whether the parameters be designated by A, B, C, or by any 
other system of symbols. 

It should be emphasized that the partial derivatives of (51.1) must 
be taken with respect to the approximate values of the parameters, 
i.e., they must be obtained from (49) and not from (50), although the 
differences in the results obtained by using (50) are quite small. The 
values of [αα], [ββ], etc., however, relate to the parameters of (50), 
and are obtained by solving the normal equations (48). This follows 
from the proof of (27.1). 

The partial derivatives are 

\[
\begin{aligned}
\frac{\partial y}{\partial A_0} &= \frac{B_0\,x e^{-A_0 x}}{(e^{-A_0 x} + C_0)^2} = \frac{x e^{-A_0 x}\,y^2}{B_0}, \\
\frac{\partial y}{\partial B_0} &= \frac{1}{e^{-A_0 x} + C_0} = \frac{y}{B_0}, \\
\frac{\partial y}{\partial C_0} &= -\frac{B_0}{(e^{-A_0 x} + C_0)^2} = -\frac{y^2}{B_0}.
\end{aligned}
\]
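
The closed forms of the three partial derivatives, written in terms of y, can be checked against central finite differences. The sketch below does this at x = 70 with the parameters of (49); the step sizes and function names are arbitrary illustrative choices.

```python
import math


def f(x, A, B, C):
    """The logistic (43)."""
    return B / (math.exp(-A * x) + C)


def partials(x, A0, B0, C0):
    """Partial derivatives of f with respect to A0, B0, C0, in terms of y."""
    y = f(x, A0, B0, C0)
    e = math.exp(-A0 * x)
    return x * e * y**2 / B0, y / B0, -y**2 / B0


def central_difference(g, p, h):
    return (g(p + h) - g(p - h)) / (2.0 * h)


x, A0, B0, C0 = 70.0, 0.0313962, 2.90688, 0.0147862
dy_dA, dy_dB, dy_dC = partials(x, A0, B0, C0)

num_dA = central_difference(lambda A: f(x, A, B0, C0), A0, 1e-6)
num_dB = central_difference(lambda B: f(x, A0, B, C0), B0, 1e-6)
num_dC = central_difference(lambda C: f(x, A0, B0, C), C0, 1e-7)

print(dy_dA, num_dA)
print(dy_dB, num_dB)
print(dy_dC, num_dC)
```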


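Substituting in (51.1), and rearranging, we have 

\[
\sigma_f = \epsilon \left\{
\left( [\alpha\alpha](x e^{-A_0 x})^2 - 2[\alpha\gamma]\,x e^{-A_0 x} + [\gamma\gamma] \right)\frac{y^4}{B_0^2}
+ 2\left( [\alpha\beta]\,x e^{-A_0 x} - [\beta\gamma] \right)\frac{y^3}{B_0^2}
+ [\beta\beta]\frac{y^2}{B_0^2}
\right\}^{1/2}
\tag{51.2}
\]

or 

\[
\sigma_f = \frac{\epsilon\,y}{B_0} \left\{
\left( [\alpha\alpha](x e^{-A_0 x})^2 - 2[\alpha\gamma]\,x e^{-A_0 x} + [\gamma\gamma] \right) y^2
+ 2\left( [\alpha\beta]\,x e^{-A_0 x} - [\beta\gamma] \right) y
+ [\beta\beta]
\right\}^{1/2}
\tag{51.3}
\]

where y = f(A_0, B_0, C_0). 

By solving (48) by the combined method of substitution, we easily 
obtain 

ε     =  0.508,821 
ε/B_0 =  0.175,040 
[αα]  =  0.000,001,516,0 
[ββ]  =  0.060,421,1 
[γγ]  =  0.000,000,778,16 
[αβ]  = −0.000,292,76 
[αγ]  =  0.000,000,287,23 
[βγ]  = −0.000,004,867,8. 

Substituting these values in (51.3) and factoring out 10^{-8} from the 
expression inside the braces, we have 

\[
\sigma_f = 0.175040\,y\,(10^{-4}) \left\{
[151.60\,(x e^{-A_0 x})^2 - 57.446\,x e^{-A_0 x} + 77.816]\,y^2
- (58{,}552\,x e^{-A_0 x} - 978.56)\,y + 6{,}042{,}110
\right\}^{1/2},
\tag{51.4}
\]

where A_0 = 0.031,396. 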
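
Equation (51.4) is straightforward to evaluate numerically. The sketch below codes it as printed, with y computed from the approximate parameters of (49) as the text prescribes; the function name `sigma_f` is illustrative. Since the expression in braces is a difference of large numbers, the rounded coefficients limit the accuracy of the result for intermediate values of x.

```python
import math

A0, B0, C0 = 0.0313962, 2.90688, 0.0147862  # parameters of (49)


def sigma_f(x):
    """Standard error of the logistic from equation (51.4), in millions."""
    e = math.exp(-A0 * x)
    y = B0 / (e + C0)   # y = f(A0, B0, C0)
    u = x * e           # the recurring factor x * exp(-A0 x)
    inside = ((151.60 * u**2 - 57.446 * u + 77.816) * y**2
              - (58552.0 * u - 978.56) * y
              + 6042110.0)
    return 0.175040 * y * 1e-4 * math.sqrt(inside)


print(f"x = 130 (1910):       sigma_f = {sigma_f(130):.3f}")
print(f"x = 2000 (asymptote): sigma_f = {sigma_f(2000):.3f}")
```

For very large x the factor x e^{-A_0 x} vanishes and the value approaches the standard error of the upper asymptote.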


TABLE V 

THE PAST AND FUTURE POPULATION OF THE UNITED STATES AS 
CALCULATED FROM THE LOGISTIC AND ITS STANDARD ERROR 

                        Population in millions                       Standard error
               Observed                     Difference between
Year           or            Calculated ²   est. or obs. and      Absolute    Relative
               estimated ¹                  calculated
               y_o           y_c            y_o − y_c             σ_f         σ_f / y_c

Lower
asymptote      [illegible]   [illegible]    [illegible]           [illegible] [illegible]
1610           0.000,210     0.0142         −0.0139               0.0021      0.147
1620           0.002,50      0.0194         −0.0169               0.0027      0.141
1630           0.005,70      0.0265         −0.0208               0.0036      0.134
1640           0.027,9       0.0363         −0.0083               0.0047      0.128
1650           0.051,7       0.0496          0.0021               0.0061      0.122
1660           0.084,8       0.0679          0.0170               0.0079      0.116
1670           0.114         0.0928          0.022                0.010       0.110
1680           0.156         0.127           0.029                0.013       0.104
1690           0.214         0.174           0.040                0.017       0.098
1700           0.275         0.238           0.037                0.022       0.092
1710           0.358         0.325           0.033                0.028       0.085
1720           0.474         0.444           0.030                0.035       0.079
1730           0.655         0.607           0.048                0.044       0.073
1740           0.889         0.830           0.059                0.056       0.067
1750           1.207         1.134           0.073                0.069       0.061
1760           1.610         1.548           0.062                0.085       0.055
1770           2.205         2.112           0.093                0.103       0.049
1780           2.781         2.879          −0.098                0.123       0.043
1790           3.930         3.918           0.012                0.145       0.037
1800           5.308         5.321          −0.013                0.166       0.031
1810           7.240         7.209           0.031                0.185       0.026
1820           9.638         9.732          −0.094                0.199       0.020
1830          12.866        13.076          −0.210                0.206       0.016
1840          17.069        17.463          −0.394                0.204       0.012
1850          23.192        23.135           0.057                0.199       0.009
1860          31.443        30.337           1.106                0.205       0.007
1870          38.558        39.273          −0.715                0.233       0.006
1880          50.156        50.047           0.109                0.268       0.005
1890          62.948        62.598           0.350                0.277       0.004
1900          75.995        76.647          −0.652                0.275       0.004
1910          91.972        91.685           0.287                0.459       0.005

Sums of differences                         +2.497 | −2.236

[illegible]    ......        [illegible]     ......               0.951       0.009
[illegible]    ......        [illegible]     ......               2.628       0.019
[illegible]    ......        [illegible]     ......               4.729       0.030
[illegible]    ......        [illegible]     ......               6.651       0.038
[illegible]    ......        [illegible]     ......               8.094       0.044
[illegible]    ......        [illegible]     ......              10.313       0.053
Upper
asymptote      ......        196.263         ......              10.461       0.053























1 Estimates of the population prior to 1790 taken from W. S. Rossiter, A Century of Population 
Growth in the United States 1790-1900, Table 1, p. 9. 


2 Calculated from equation (50). 

This is the standard error of equation (50). It gives the standard 
error of the population of the United States for a particular year on 
the assumption that the law of growth of the population is that given 
by the logistic. 

Table V presents a comparison between the extrapolated population 
of the United States and its standard error. Both forward and 
backward extrapolations, with both their absolute and their relative 
standard errors, are shown. The backward extrapolations are also 
compared with the United States Census estimates of the Colonial 
population by decades, 1610-1780. 

The absolute standard error approaches zero as the population 
approaches its lower asymptote, and it approaches the value of 10.461 
as the population approaches its upper asymptote. Between these 
limits it has several temporary maxima and minima. 

The relative standard error opens up for both forward and backward 
extrapolations. However, it opens up considerably more for 
the latter than for the former. Thus, while the calculated population 
for 1610 has a standard error of 14.7 per cent, the maximum population 
indicated by the logistic has an error of only 5.3 per cent. 

It must be kept in mind that these are least square standard errors. 
Had the Pearl-Reed weights and parameters as deduced from their 
normal equations (42) been substituted in equation (51.1), they 
would have yielded much larger standard errors, that for the upper 
asymptotic population being 124 millions!¹ This shows the importance 
of using the true least square procedure in fitting the population 
logistic. 

FIGURE IV-A 

COMPARISON OF THE POPULATION OF THE UNITED STATES AS FORECASTED BY 
THE LOGISTIC, WITH TWICE ITS STANDARD ERROR 

Observed population is shown by open circles. 
Forecasted population is shown by dots. 
Shaded area represents range of “possible” trend or f(x) ± 2σ_f. 

FIGURE IV-B 

THE PAST AND FUTURE POPULATION OF THE UNITED STATES AS GIVEN BY THE 
LOGISTIC, WITH TWICE ITS STANDARD ERROR 

The computed population before 1790 is also compared with estimates of Colonial population. 
[Figure: population in millions, with the upper asymptote marked.] 

Figures IV-A and IV-B are graphic representations of the data of 
Table V. The first shows the forward extrapolation of the logistic 
with twice its standard error. The second shows both the forward 
and the backward extrapolations with twice their standard errors. 
As it is not convenient to represent both forward and backward 
extrapolations on the same arithmetic scale, a ratio scale is used in 
Figure IV-B. The agreement between the estimated Colonial population 
and the figures obtained by extrapolating the logistic backward 
is remarkably good. For nearly a century preceding the first census, 
the computed population deviates by less than ±2σ_f from the census 
estimates; and (with the exception of the estimates for 1610, 1620, 
and 1630) all other estimates fall only slightly outside of this range. 
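
The statement about the ±2σ_f range can be checked directly against the Colonial rows of Table V. The sketch below lists the decades in which the census estimate lies within twice the standard error of the computed population; the figures are transcribed from the table and the variable names are illustrative.

```python
# (year, estimated, calculated, sigma_f) from Table V, populations in millions.
rows = [
    (1610, 0.000210, 0.0142, 0.0021), (1620, 0.00250, 0.0194, 0.0027),
    (1630, 0.00570, 0.0265, 0.0036), (1640, 0.0279, 0.0363, 0.0047),
    (1650, 0.0517, 0.0496, 0.0061), (1660, 0.0848, 0.0679, 0.0079),
    (1670, 0.114, 0.0928, 0.010), (1680, 0.156, 0.127, 0.013),
    (1690, 0.214, 0.174, 0.017), (1700, 0.275, 0.238, 0.022),
    (1710, 0.358, 0.325, 0.028), (1720, 0.474, 0.444, 0.035),
    (1730, 0.655, 0.607, 0.044), (1740, 0.889, 0.830, 0.056),
    (1750, 1.207, 1.134, 0.069), (1760, 1.610, 1.548, 0.085),
    (1770, 2.205, 2.112, 0.103), (1780, 2.781, 2.879, 0.123),
]

# Years where |estimated - calculated| is less than twice the standard error.
inside = [year for year, est, calc, s in rows if abs(est - calc) < 2 * s]
# Deviation in units of sigma_f for the years that fall outside the range.
outside = {year: round(abs(est - calc) / s, 2)
           for year, est, calc, s in rows if abs(est - calc) >= 2 * s}

print("within 2 sigma:", inside)
print("outside, in units of sigma:", outside)
```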


5. Comparison with the Pearl-Reed Results 


It is instructive to compare the standard error of the asymptotic 
population which was just deduced with that given by Pearl and Reed. 

Pearl and Reed also solve a system of simultaneous equations in 
order to obtain the standard errors (and hence the probable errors) 
of their parameters,² but they do not determine the standard error of 
the function. By their procedure, the standard error of the upper 
asymptote is ±0.82 millions of population,³ which is only 8 per cent 
of my figure of ±10.5 millions. 
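
The conversion behind these figures is the usual one for the normal law, under which the probable error is 0.6745 times the standard error. A minimal sketch of the arithmetic, with illustrative names:

```python
PROBABLE_ERROR_FACTOR = 0.6745  # probable error = 0.6745 x standard error


def standard_error_from_probable(probable_error):
    return probable_error / PROBABLE_ERROR_FACTOR


se_pearl = standard_error_from_probable(0.55)  # Pearl's probable error, millions
ratio = se_pearl / 10.5                        # against the least square figure
print(f"standard error = {se_pearl:.2f} millions, ratio = {100 * ratio:.0f} per cent")
```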


6. Experimental Verification 


Instructive as this experiment with backward extrapolations may be, 
it is not so significant or useful as an experiment with forward extra- 
polations. Is it possible to subject the standard error of a forecast to a 
concrete test? 

If we had data for populations which had each completed a full cycle 
of growth, such a test would be entirely possible, for we might then fit 
the curve to the data for the first half of each cycle, forecast the population 
for the remainder of the cycle, and compare the differences between 
the observed and the forecasted populations with respect to their standard 
errors. Such a comparison for different populations would be of 
great practical and theoretical importance. 

1 In my first researches into this problem I mistook the Pearl-Reed method of fitting for a true least 
square method, computed the weights and the parameters from the normal equations given on page 579 
of Pearl’s Studies in Human Biology, and obtained the foregoing result which I announced, a few hours 
later, at a round table on the Probable Future Population (Norman Waite Harris Memorial Foundation), 
which was held in Chicago, on June 19, 1929, and at which Professor Pearl was present. I am 
exceedingly sorry to have made that premature announcement, and I trust that the present paper will 
put the whole subject of the standard error of the population logistic in a clearer light. 

2 Raymond Pearl, op. cit., pp. 581-83. 

3 Raymond Pearl, op. cit., p. 592. 
Pearl gives a probable error of ±0.55 millions, which is equivalent to a standard error of 0.82 millions. 

Unfortunately, we have no data for any country which has completed 
a full cycle of growth, i.e., whose observed populations are well 
distributed over the two branches of the same logistic, except for Algeria.¹ 
And for this country there are only seven reliable observations 
(censuses), a number which is too small for such an experiment. We 
must therefore have recourse to an experimental population. 

1 Raymond Pearl, Biology of Population Growth (1925), chap. 3: “The Indigenous Native Population 
of Algeria.” 

The history of such a population is provided by T. Carlson’s experiments 
on the growth of yeast which are summarized in the first two 
columns of Table VI. Of this experiment Pearl says, 

    Suppose we . . . consider what happens when a few yeast cells are dropped 
    into an appropriate nutritive solution, saccharine in nature, and the whole kept 
    at a moderately warm temperature. In such a satisfactory environment the 
    initially sown cells quickly divide and re-divide. Here plainly we are dealing 
    with the growth of population, of yeast cells to be sure, but still a population. 
    We can, just as with human beings, take periodic census counts, or their 
    methodological equivalent, and so determine the growth of the population. The data 
    . . . are from the experimental work of Carlson.¹ . . . The quantity of yeast 
    at intervals in the growth was measured by a method of centrifugation and 
    subsequent determination of volume and mass. So that while the figures . . . are 
    not actual census counts of the yeast population, they are numbers directly and 
    with a high degree of accuracy proportional to what such counts would have been 
    if they had been taken.² 

TABLE VI 

COMPARISON OF RESULTS OBTAINED BY FITTING THE LOGISTIC TO A 
POPULATION OF YEAST CELLS BY THE PEARL-REED 
METHOD AND THE METHOD OF LEAST SQUARES 
(FIRST APPROXIMATION) 

                    Quantity of yeast                      Differences between
                         Calculated by the              Observed      Observed      Standard
Days of                                Method of least  and           and least     error
growth    Observed *   Pearl-Reed      squares (first   Pearl-Reed    square        of (4) §
                       method †        approximation) ‡ values        values
                                                        (2) − (3)     (2) − (4)
 (1)        (2)          (3)             (4)              (5)           (6)           (7)

  0         9.6          9.9             9.1             −0.3          +0.5           0.4
  1        18.3         16.8            15.6             +1.5          +2.7           0.6
  2        29.0         28.2            26.5             +0.8          +2.5           0.8
  3        47.2         46.7            44.5             +0.5          +2.7           1.1
  4        71.1         76.0            73.3             −4.9          −2.2           1.4
  5       119.1        120.1           117.2             −1.0          +1.9           1.6
  6       174.6        181.9           179.4             −7.3          −4.8           1.7
  7       257.3        260.3           258.9             −3.0          −1.6           1.6
  8       350.7        348.2           348.3             +2.5          +2.4           1.6
  9       441.0        433.9           435.3             +7.1          +5.7           [illegible]
 10       513.3        506.9           508.9             +6.4          +4.4           1.7
 11       559.7        562.3           564.1             −2.6          −4.4           1.5
 12       594.8        600.8           601.8             −6.0          −7.0           1.3
 13       629.4        625.8           626.1             +3.6          +3.3           1.2
 14       640.8        641.5           641.1             −0.7          −0.3           1.3
 15       651.1        651.0           650.1             +0.1          +1.0           1.4
 16       655.9        656.7           655.5             −0.8          +0.4           1.5
 17       659.6        660.1           658.6             −0.5          +1.0           1.6
 18       661.8        662.1           660.4             −0.3          +1.4           1.6

Sums of differences                                  +22.5 | −27.4  +29.9 | −20.3

* T. Carlson’s data, given in Table 4, p. 217 of The Biology of Population Growth by Raymond Pearl. 
† Equation (52). ‡ Equation (53). § Equation (54). 


The logistic fitted by Pearl to the data is 

\[ y = \frac{665}{1 + e^{4.1896 - 0.5355x}}, \]

where y is the quantity of yeast and x is the time of growth in days. 
The smooth curve in Figure V is not the graph of this equation, but of 
equation (53) which will be referred to later. But the difference between 
the two functions can hardly be shown on the scale used. 
Written in the form used in this study, Pearl’s equation becomes 

\[ y = \frac{10.076}{0.01515 + e^{-0.5355x}}. \tag{52} \]

The algebraic sum of the residuals is −4.9. Eight of the observations 
are above the curve and eleven are below the curve. The quadratic 
mean error, ε, is 3.92. 
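
The passage from Pearl's form to the form of (52) follows on multiplying numerator and denominator by e^{-a}: if y = K / (1 + e^{a − bx}), then y = K e^{-a} / (e^{-a} + e^{-bx}), so that A = b, B = K e^{-a}, and C = e^{-a}. A short sketch of the conversion, with an illustrative function name:

```python
import math


def to_study_form(K, a, b):
    """Convert y = K / (1 + exp(a - b x)) to y = B / (C + exp(-A x))."""
    C = math.exp(-a)
    return b, K * C, C   # A, B, C


A, B, C = to_study_form(K=665.0, a=4.1896, b=0.5355)
print(f"A = {A}, B = {B:.3f}, C = {C:.5f}")

# The two forms give the same ordinate, for example at x = 5 days.
x = 5.0
pearl_form = 665.0 / (1.0 + math.exp(4.1896 - 0.5355 * x))
study_form = B / (C + math.exp(-A * x))
```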

A useful test of the adequacy of the standard error formula will be to 
break up this series of nineteen observations into two parts, fit the 
logistic function to the lower part and extrapolate it forward, fit the 
same function to the upper part and extrapolate it backward, and com- 
pare for each part the difference between the observed and the extra- 
polated values with respect to their standard errors. But before pro- 
ceeding with this test it is desirable to see whether Pearl’s fit to 
the nineteen observations might not be improved by a least square 
correction. 

1 T. Carlson, “Über Geschwindigkeit und Grösse der Hefevermehrung in Würze,” Biochem. Ztschr., 
Bd. 57, pp. 313-334, 1913. 
2 Raymond Pearl, Biology of Population Growth, pp. 9-10. 

Taking the parameters of (52) as first approximations to the most 
probable parameters, and proceeding as we did with the population for 
the United States, we have for our least square corrections and their 
standard errors (which are also the standard errors of the parameters) 

ΔA = +0.011,055,     σ_A = 0.005,459,9 
ΔB = −0.815,39,      σ_B = 0.413,97 
ΔC = −0.001,182,6,   σ_C = 0.000,604,84. 

The new equation becomes 

\[ y = \frac{9.2609}{0.013970 + e^{-0.5466x}} = \frac{662.9}{1 + 71.58\,e^{-0.5466x}}. \tag{53} \]

The values computed from this equation are given in column (4), 
Table VI. The algebraic sum of the residuals is +9.6. Of the nineteen 
observations, six lie below the curve and thirteen above. The 
quadratic mean error, ε, is 3.50. The sums of the linear terms in the 
Taylor expansion ¹ (which, of course, are kept) for x = 0, 8, and 18 are 
−0.79166, +0.72373, −1.5605, respectively. The corresponding 
sums of the quadratic terms (which are neglected) are −0.0004544, 
−0.018921, and +1.8823. Thus it will be seen that, when x = 18, the 
quadratic terms are not small enough to be neglected. In other words, 
while the quadratic mean error of (53) is less than for Pearl’s equation 
(52), the fit is still not good from a strict least square point of view. 
The standard error of (53) is 

\[
\sigma_f = 0.34753\,y\,(10^{-4}) \left\{
[243.11\,(x e^{-A_0 x})^2 + 51.621\,x e^{-A_0 x} + 2.9834]\,y^2
- 2(17838\,x e^{-A_0 x} + 2038.5)\,y + 1{,}397{,}500
\right\}^{1/2},
\tag{54}
\]

where A_0 = 0.5355. 

Table VI provides a comparison between the observed population 
and the theoretical populations as computed by the Pearl-Reed method 
(52), and by the method of least squares (53). The same table also 
shows the values of the standard error (54) for the nineteen observa- 
tions. 

Figure V is a graphic representation of (53), which was obtained by 
using the method of least squares, and of the standard error (54) of this 
function. The standard error is so small that it cannot be shown on 
the scale used for the graph of (53). To show its properties, it was 
necessary to represent (53) as stretched out upon and coincident with 
the straight line 0 0, and to plot its standard error on a much-enlarged 
scale. 

It is interesting to observe that the standard error of this function is 
less than its quadratic mean error, ε. Though its standard error is 
quite small, the fact that the observations are not well distributed 
about the curve, and the fact that for x = 18 the quadratic terms cannot 
be neglected, both suggest the need for a repetition of the least square 
procedure, using the parameters of (53) as first approximations. But 
as we are interested not so much in getting the best graduation of the 
entire series of the nineteen points as in getting light on the standard 
error of a forecast when only half of the observations are used, we shall 
not carry out the second approximation, but proceed directly to the 
experiment. 

1 See pp. 141-2, 165, and Table IV. 

FIGURE V 

POPULATION OF YEAST CELLS FITTED WITH A LOGISTIC BY THE METHOD OF LEAST 
SQUARES (LOWER CURVE), AND THE STANDARD ERROR OF THE LOGISTIC (UPPER 
CURVE) 

Scale of standard error is 25 times that of the logistic. 
[Figure: amount of yeast plotted against days of growth; asymptote = 662.9.] 

Making an arbitrary division at x = 8.5, and designating the nine 
points from x = 0 to x = 8 as the lower branch, and the ten points from 
x = 9 to x = 18 as the upper branch, and fitting the logistic function to 
each branch separately by taking the constants of (53) as first approximations 
to the parameters of the equations of both of these branches, 
we have the following results. 

For the lower branch the equation is 

\[ y = \frac{10.549}{0.01382 + e^{-0.5105x}} = \frac{763.3}{1 + 72.36\,e^{-0.5105x}}, \tag{55} \]

the origin being at x = 0. The algebraic sum of the residuals is +16.5. 
Of the nine observations, only two fall below the curve. The quadratic 
mean error, ε, is 4.07, which is larger than the corresponding values for 
the lower branch obtained from (52) and (53). The standard errors of 
the parameters are 

σ_A = 0.026,85, σ_B = 1.102, σ_C = 0.000,955,8. 

The sums of the linear terms in the Taylor expansion for x = 0 and x = 8 
are +1.2722 and +2.7151 respectively. The corresponding sums for 
the quadratic terms are +0.000,093,8 and −3.7887 respectively. The 
fit of this curve, which does not use the data for the upper branch, is 
worse than that of (53). 

For the upper branch the equation is 

\[ y = \frac{13.093}{0.01969 + e^{-0.5045x}} = \frac{665.1}{1 + 50.79\,e^{-0.5045x}}, \tag{56} \]

the origin being at x = 0. The algebraic sum of the residuals is +44.6. 
All of the observations lie above the curve. The quadratic mean error ε is 
7.18. The standard errors of the parameters are 

σ_A = 0.040,71, σ_B = 3.563, σ_C = 0.005,308. 

The sums of the linear terms for x = 9 and x = 18 are +6.6426 and 
+2.1355 respectively. The corresponding quadratic terms which are 
neglected are 7.3768 and 53.733. The fit is not simply bad, it is no fit 
at all. Furthermore, a test showed that the results could not be 
improved significantly by using the Pearl constants of (52) as first 
approximations. 

TABLE VII 

COMPARISON OF THE FORECASTED FUTURE GROWTH OF A POPULATION 
OF YEAST CELLS WITH THE OBSERVED GROWTH AND WITH 
THE STANDARD ERROR OF THE FORECAST 

The forecasts were obtained by extrapolating the logistic fitted to the first nine points. 

              Quantity of yeast           Difference         Standard error
Days of                                   between
growth     Observed *   Calculated †     (2) − (3)       Absolute ‡   Relative
                                                                       (5)/(3)
 (1)         (2)          (3)              (4)             (5)          (6)

  0          9.6          10.5            −0.9             0.6         0.057
  1         18.3          17.3            +1.0             0.8         0.045
  2         29.0          28.4            +0.6             0.9         0.033
  3         47.2          46.2            +1.0             1.0         0.022
  4         71.1          74.2            −3.1             1.0         0.014
  5        119.1         116.4            +2.7             1.0         0.009
  6        174.6         176.6            −2.0             1.3         0.007
  7        257.3         256.2            +1.1             1.4         0.006
  8        350.7         351.1            −0.4             2.0         0.006
  9        441.0         451.4                             6.3         0.014
 10        513.3         544.7                            13.7         0.025
 11        559.7         621.8                            22.5         0.036
 12        594.8         679.6                            30.9         0.046
 13        629.4         719.6                            37.9         0.053
 14        640.8         746.0                            43.1         0.058
 15        651.1         762.8                            46.8         0.061
 16        655.9         773.2                            49.2         0.064
 17        659.6         779.6                            50.8         0.065
 18        661.8         783.5                            51.9         0.066

* T. Carlson’s data, given in Table 4, p. 217 of The Biology of Population Growth, by Raymond Pearl. 
† Equation (57). 
‡ Equation (59). 

The results obtained are disappointing. Evidently a second approximation 
is necessary. 

Taking the parameters of (55) as first approximations and applying 
the method of least squares again, we obtain for the lower branch the 
equation 

\[ y = \frac{10.599}{0.01343 + e^{-0.5111x}} = \frac{789.4}{1 + 74.47\,e^{-0.5111x}}. \tag{57} \]

The algebraic sum of the residuals is 0.0. Four of the observations are 
below the curve and five are above. The quadratic mean error is 2.06. 
The standard errors of the parameters are 

σ_A = 0.013,28, σ_B = 0.604,8, σ_C = 0.000,552,8. 

The sums of the linear terms for x = 0 and x = 8 are 0.053,24 and 6.966. 
The sums of the corresponding quadratic terms are 0.000,011 and 
0.076,18. The neglected terms are small as compared with the linear 
terms. The fit is excellent. 

FIGURE VI 

COMPARISON OF THE ESTIMATED FUTURE GROWTH OF A POPULATION OF YEAST 
WITH TWICE ITS STANDARD ERROR AND WITH THE OBSERVED POPULATION 

The forecasts were obtained by extrapolating the logistic fitted to the first nine observations. 

Observed population is shown by open circles. 
Forecasted population is shown by dots. 
Shaded area represents range of “possible” trend or f(x) ± 2σ_f. 

For the upper branch we have, for our second approximation, 

\[ y = \frac{13.868}{0.02082 + e^{-0.5058x}} = \frac{666.1}{1 + 48.03\,e^{-0.5058x}}, \tag{58} \]

the origin being at x = 0, the same as for the lower branch. The algebraic 
sum of the residuals is +2.1. Of the ten observations six are 
below the curve and four are above. The quadratic mean error ε is 
3.15. The standard errors of the parameters are 

σ_A = 0.016,48, σ_B = 2.042, σ_C = 0.003,023. 


TABLE VIII 


COMPARISON OF THE ESTIMATED PAST GROWTH OF A POPULATION 
OF YEAST CELLS WITH THE ACTUAL GROWTH AND WITH THE 
STANDARD ERROR OF THE ESTIMATE 


The estimates were obtained by extrapolating backward the logistic fitted to the last ten observations.

              Quantity of Yeast                     Standard Error of
                                                    Calculated Values
 Days     Observed*    Calculated†    Difference    Absolute‡    Relative
                                       (2) - (3)                  (5)/(3)
  (1)        (2)           (3)            (4)          (5)          (6)

   0         9.6          13.6                         2.0         0.145
   1        18.3          22.2                         2.8         0.128
   2        29.0          36.0                         4.0         0.111
   3        47.2          57.7                         5.4         0.093
   4        71.1          90.4                         6.8         0.075
   5       119.1         137.6                         7.9         0.057
   6       174.6         200.8                         8.1         0.040
   7       257.3         277.8                     [illegible]     0.025
   8       350.7         361.3                         5.0         0.014
   9       441.0         441.4           -0.4          2.8         0.006
  10       513.3         509.6           +3.7          1.7         0.003
  11       559.7         562.0           -2.3          1.8         0.003
  12       594.8         599.1           -4.3          1.7         0.003
  13       629.4         624.0           +5.4          1.4         0.002
  14       640.8         640.0           +0.8          1.2         0.002
  15       651.1         650.1           +1.0          1.2         0.002
  16       655.9         656.4           -0.5          1.4         0.002
  17       659.6         660.2           -0.6          1.6         0.002
  18       661.8         662.5           -0.7          1.8         0.003

* T. Carlson’s data, given in Table 4, p. 217 of The Biology of Population Growth, by Raymond Pearl.
† Equation (58).
‡ Equation (60).
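
As a rough check on Table VIII, the following Python sketch evaluates both forms of equation (58) and recomputes the residuals for days 9 to 18. The exponent coefficient 0.5052 is an assumption: it is the reading of a damaged line that reproduces the tabulated values. All function and variable names are illustrative and do not come from the paper.

```python
import math

# Constants of equation (58). RATE is an assumed reading of the exponent
# coefficient, chosen because it reproduces the calculated column of Table VIII.
NUMERATOR, DENOM_CONST = 13.868, 0.02082
UPPER_LIMIT, COEF = 666.1, 48.03
RATE = 0.5052

def logistic_first_form(x):
    """y = 13.868 / (0.02082 + exp(-RATE * x))."""
    return NUMERATOR / (DENOM_CONST + math.exp(-RATE * x))

def logistic_second_form(x):
    """y = 666.1 / (1 + 48.03 * exp(-RATE * x))."""
    return UPPER_LIMIT / (1.0 + COEF * math.exp(-RATE * x))

# Observed and calculated values for days 9-18, copied from Table VIII.
days = list(range(9, 19))
observed = [441.0, 513.3, 559.7, 594.8, 629.4, 640.8, 651.1, 655.9, 659.6, 661.8]
calculated = [441.4, 509.6, 562.0, 599.1, 624.0, 640.0, 650.1, 656.4, 660.2, 662.5]

# Residual = observed minus calculated, rounded to the table's one decimal.
residuals = [round(o - c, 1) for o, c in zip(observed, calculated)]
residual_sum = round(sum(residuals), 1)
below_curve = sum(1 for r in residuals if r < 0)  # observation lies under the curve
above_curve = sum(1 for r in residuals if r > 0)  # observation lies over the curve

for day, table_value in zip(days, calculated):
    print(day, round(logistic_second_form(day), 1), table_value)
print("residuals:", residuals)
print("sum:", residual_sum, "below:", below_curve, "above:", above_curve)
```

Under these assumptions the recomputed residuals agree with the statements in the text: an algebraic sum of +2.1, with six observations below the curve and four above.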


The sums of the linear terms for x = 9 and x = 18 are 10.39 and 1.345.
The corresponding sums for the quadratic terms which have been 
neglected are +0.1162 and +1.043 respectively. Though the fit of 
this curve is not so good as that of (57) it is better than that of (52), 
(53), and (55). We shall not, therefore, have recourse to a third ap- 
proximation but use these equations as they stand. 

Assuming that these equations describe satisfactorily the series to 
which they have been fitted, all we need to do to perform our test is 
to extrapolate the curve for the lower branch forward, and to ex- 
trapolate the curve for the upper branch backward, and to compare 
the differences between the observed and the extrapolated values with 
respect to their standard errors. 

Such a comparison for the lower branch is given in Table VII 
and Figure VI, and for the upper branch in Table VIII and Figure VII. 

The standard error derived from the data for the lower branch is

$$\sigma_f = 0.19519\,y\,\bigl(10^{[\text{exponent illegible}]}\bigr)\Bigl\{\bigl[4159.6\,(xe^{-\lambda x})^2 - 87.749\,xe^{-\lambda x} + 7.2072\bigr]y^2 - 2\bigl(183260\,xe^{-\lambda x}\ [\text{sign illegible}]\ 88.081\bigr)y + 8{,}626{,}600\Bigr\}^{1/2}, \tag{59}$$

where $\lambda = 0.5105$.

FIGURE VII

COMPARISON OF THE ESTIMATED PAST GROWTH OF A POPULATION OF YEAST
WITH TWICE ITS STANDARD ERROR AND WITH THE OBSERVED PAST POPULATION

The estimates were obtained by extrapolating backward the logistic fitted to the last ten observations.

Observed population is shown by open circles (○ ○).
Forecasted population is shown by dots (● ●).
Shaded area represents range of “possible” trend, or $f(x) \pm 2\sigma_f$.

The standard error derived from the data for the upper branch is

$$\sigma_f = 0.24071\,y\,\bigl(10^{-[\text{digit illegible}]}\bigr)\Bigl\{\bigl[2734.7\,(xe^{-\lambda x})^2 + 996.28\,xe^{-\lambda x} + 91.974\bigr]y^2 - 2\bigl(336850\,xe^{-\lambda x} + 62116\bigr)y + 41{,}961{,}000\Bigr\}^{1/2}, \tag{60}$$

where $\lambda = 0.5045$.
In Figures VI and VII the function is compared not with its standard 
error but with twice the standard error, which corresponds approxi- 
mately to three times the probable error. 
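
The correspondence between twice the standard error and three times the probable error can be checked numerically. The sketch below assumes normally distributed errors, for which the probable error is the deviation exceeded half the time; the function names are illustrative only.

```python
from statistics import NormalDist

# For a normal variate the probable error is the 75th percentile of the
# standard normal distribution, expressed in units of the standard error.
PROBABLE_ERROR_PER_SIGMA = NormalDist().inv_cdf(0.75)  # about 0.6745

def in_probable_errors(k_sigma):
    """Express a multiple of the standard error in probable errors."""
    return k_sigma / PROBABLE_ERROR_PER_SIGMA

def coverage(k_sigma):
    """Normal probability of falling within +/- k_sigma standard errors."""
    return 2.0 * NormalDist().cdf(k_sigma) - 1.0

print(round(PROBABLE_ERROR_PER_SIGMA, 4))
print(round(in_probable_errors(2.0), 2))
print(round(coverage(2.0), 4))
```

On the normal assumption, twice the standard error is a little under three probable errors and covers roughly 95 per cent of cases.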
A glance at Figure VI shows that if, when we reached the eighth
observation, we had forecasted the population for the upper branch by
fitting the logistic to the lower branch and extrapolating it, and if we
had allowed ourselves the customary range of error of $\pm 2\sigma_f$, this
range would have failed to include the actual future population.

On the other hand, if we had attempted to reconstruct the most
probable population for the first half of the cycle by extrapolating
backward the logistic fitted to the second half, and if, as before, we had
allowed ourselves the customary range of $\pm 2\sigma_f$, this range would have
caught some of the actual past populations and missed some of the
others.

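The backward test can be retraced from the first nine rows of Table VIII. The sketch below is only an illustration: the names are invented here, the comparison is made on values rounded to one decimal as printed, and day 7 is left out because its standard error is not legible in the table.

```python
# Days 0-8 of Table VIII: (day, observed, back-extrapolated value, standard error).
# None marks the day-7 standard error, which cannot be read from the source.
rows = [
    (0, 9.6, 13.6, 2.0),
    (1, 18.3, 22.2, 2.8),
    (2, 29.0, 36.0, 4.0),
    (3, 47.2, 57.7, 5.4),
    (4, 71.1, 90.4, 6.8),
    (5, 119.1, 137.6, 7.9),
    (6, 174.6, 200.8, 8.1),
    (7, 257.3, 277.8, None),
    (8, 350.7, 361.3, 5.0),
]

def classify(table_rows, k=2.0):
    """Split days into those whose observed value lies within +/- k standard
    errors of the extrapolated value and those lying outside that range."""
    caught, missed = [], []
    for day, observed, calculated, std_error in table_rows:
        if std_error is None:
            continue  # cannot be judged without a standard error
        gap = round(abs(observed - calculated), 1)
        if gap <= round(k * std_error, 1):
            caught.append(day)
        else:
            missed.append(day)
    return caught, missed

caught, missed = classify(rows)
print("within twice the standard error:", caught)
print("outside twice the standard error:", missed)
```

With the rounded table values, day 0 falls exactly on the boundary of the range and is counted here as caught; the earliest days fall inside the range and the later ones outside it.
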
IV. CONCLUSIONS 


1. The standard error of a forecast from a curve is given by the
standard error of the curve, $\sigma_f$. It must not be confused with the
standard error of estimate, S, or with the quadratic mean error, $\epsilon$, i. e.,
the standard error of a single observation of unit weight, both of which
are constants. This standard error, however, is a function of the
independent variable, x. When in the formula for $\sigma_f$, x is given a value
x' lying beyond the range of the observations, the standard error relates
to the particular extrapolated or forecasted ordinate whose abscissa
is x'.

2. The standard errors of the straight line, parabola, and cubic, and 
of the plane—in fact, of all parabolas of the higher degree and of all 
planes—have this property in common: they increase indefinitely as 
the curve is extrapolated either to the right or to the left of the ob- 
served range. They differ, however, with respect to the number of 
maximum and minimum points which they have within the range of 
the observations. 

3. The standard error of the population logistic increases as the
curve is extrapolated toward its upper asymptote, and attains a
constant value when the logistic reaches its asymptote, i. e., when $x = +\infty$.
It decreases as the curve is extrapolated toward its lower asymptote,
attaining the value of zero when $x = -\infty$. For values of x lying within
the range of the observations, the standard error of the logistic has
at least one maximum and one minimum. The relative standard
error of the lower branch of the curve opens up more than that of the
upper branch (see Figure IV-B).

4. When the logistic is fitted to the population of the United States 
from 1790 to 1910, it points to a maximum population, which will be 
practically attained by 2100, of 196 millions. The standard error of 
this population as computed by Pearl and Reed is ±0.8 million, which
is of approximately the same order of magnitude as the ordinary
standard error of estimate, S. The standard error of this population
when computed by the methods advocated in this paper is 10.5 millions,
a much more reasonable figure. That a forecast from the logistic
is subject to a much larger standard error than is given by the ac- 
cepted methods, is also suggested by the experiments with the data on 
the growth of yeast. This does not mean, however, that other curves 
may not be subject to still larger standard errors. 

5. There is no necessary relation between the goodness of fit of a 
curve to past observations and its reliability for forecasting purposes. 
A curve may fit the data for the past one hundred years with a high 
degree of accuracy, and yet fail to predict the situation for the next 
year or so. 

6. The new properties of the standard error of a forecast should aid 
us in establishing a maximum interval or period for which a forecast 
may be made and beyond which a forecast becomes highly unreliable 
or even meaningless. 

7. In order to evaluate the standard error of a function or of its 
parameters it is necessary to use the method of least squares. It ap- 
pears that no other method will do. By the substitute methods it is 
possible to compute neither the standard error of a single observation 
nor the weights of the parameters. Furthermore, these methods have 
the additional limitation of yielding no unique solution for the parame- 
ters. Invaluable as they may be—and this is particularly true of the 
graphic method—for a preliminary survey of the results, especially in 
order to determine the proper functional form to be used, they should 
never be employed in the final accurate determination of important 
constants.¹

8. When the function in question is not linear with respect to its 
parameters and has to be reduced to a linear form before the method of 
least squares can be applied, great care must be taken to obtain a good 
least square fit, or the standard error will not be reliable. To get a 
good fit several approximations may be necessary. The statistical 
experiments which have been detailed in this study show that an 
especially sensitive test of the goodness of fit of such a curve is a com- 
parison between the sum of the quadratic terms in the corrections 
which are neglected and the sum of the linear terms which are retained, 
for different values of the independent variable.² But this comparison
is not in general a safe estimate of the degree of convergence that one 
has attained by the method of successive approximations. In this, as 
in many other fields, the mathematician can be of great help to the 
statistician. 

9. Finally it must be remembered that the error of an extrapolation
discussed in this paper is the one due to errors in the determination of
the parameters, on the assumption that the curve fitted is the curve
which ought to be fitted and that there are no errors in the independent
variables. The results obtained do not, therefore, finally settle an
underlying problem which, in the words of Professor Pearl, “perhaps
never can be settled by mathematics alone.”³

¹ See Raymond T. Birge, “Probable Values of the General Physical Constants,” The Physical Review
Supplement, July, 1929, pp. 4-5.
² For an example, see Table IV, p. 167.
³ Letter to the present writer.
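
The second conclusion can be illustrated for the simplest case, the straight line. The sketch below uses the textbook least-squares expression for the standard error of a fitted ordinate, which is not necessarily written in this paper's notation, and the data are hypothetical values invented for the illustration.

```python
import math

def line_fit_standard_error(xs, ys):
    """Fit y = a + b*x by least squares and return (a, b, eps, se_at), where
    eps is the quadratic mean error of a single observation and se_at(x) is
    the standard error of the fitted ordinate at abscissa x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    eps = math.sqrt(sse / (n - 2))  # two parameters were estimated

    def se_at(x):
        # Smallest at the mean of x; grows without limit in both directions.
        return eps * math.sqrt(1.0 / n + (x - x_bar) ** 2 / sxx)

    return a, b, eps, se_at

# Hypothetical observations, invented for illustration only.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 11.0, 12.8, 15.1, 17.0]
a, b, eps, se_at = line_fit_standard_error(xs, ys)

for x in (4, 8, 12, 20, -4):
    print(x, round(se_at(x), 4))
```

In this sketch the standard error is least at the mean of the observed abscissas and rises symmetrically on either side, so an ordinate forecast far beyond the observed range carries a much larger standard error than one inside it.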
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BRITISH STATISTICS AND STATISTICIANS TODAY 


By Harold Hotelling, Stanford University


Modern statistical theory originated in England, and is today ad- 
vancing faster there than in any other country. The validity of 
methods in everyday use in a multitude of varied inquiries rests upon 
results obtained by British mathematical statisticians; for example the 
correlation coefficient and, much later, its interpretation accurately in 
terms of probability, are English discoveries. A brief survey of the 
present activities in the fields of most central statistical importance 
there may be of interest. 

The oldest and best known institution for the study of statistics is 
the Galton Biometric Laboratory, presided over by Karl Pearson, and 
a part of the University of London. L. N. G. Filon, who cooperated
with Pearson in the famous memoirs of the nineties which first gave 
probable error formulas for the correlation coefficient, is still professor 
of mathematics at University College, which includes the Galton 
Laboratory. Following the interests of its founder, the laboratory 
devotes much attention to heredity and to anthropometric and mental 
measurements, as well as to the pure theory of statistics. Its walls are 
lined with selections from such writers with statistical interests as 
Galton, Darwin, Pasteur, and Florence Nightingale, and with memora- 
bilia, such as cartoons from the London papers, of those associated with 
the institution and with science in bygone years. Advanced students 
come here for short or long periods from all over the world, and it is in 
this way that such Americans as Raymond Pearl, Truman Kelley, 
Henry Schultz, and B. H. Camp have come to know these halls. Each
student is provided with a desk and a calculating machine, and spends
his entire working time in the laboratory. The staff includes Miss
Ethel M. Elderton (joint author with her brother of a primer of
statistics), Dr. George Stocks, a medical investigator, and Egon S.
Pearson, son of Karl Pearson. The pioneer work of Karl Pearson is,
of course, familiar. He is now devoting himself largely to the history
of statistics. His great services to statistical theory, of which no
account need be given here, are being ably continued by E. S. Pearson.

A newer but extremely active center is the statistical department 
of the Rothamsted Agricultural Experimental Station, at Harpenden, 
a village 25 miles north of London. The station was founded by Sir 
John Lawes to carry on the experiments which he and Gilbert had been 
conducting since about 1839. In some of the fields wheat has been
grown continuously under uniform treatment since 1844, providing a
series of yields much more extensive than has been recorded at the 
younger experiment stations. It is noteworthy that institutions 
founded for such practical purposes as vocational training and increas- 
ing crop yields tend, as experience is accumulated, to give more and 
more attention to pure science. At Rothamsted many departments of 
biology, physics, and chemistry have been advanced. The long series 
of data evidently embodied valuable information, but it was so in- 
volved with confusing factors, such as varying weather and soil hetero- 
geneity, that the existing mathematical methods of treating statistics 
seemed inadequate to disentangle it. To assign to each of the numer- 
ous factors its due portion of the responsibility for the variability of 
yields, and to judge the significance of the result so as to decide whether 
something really had been discovered or whether the apparently 
valuable agricultural conclusions were merely accidental, required in 
fact a number of new discoveries in pure mathematics. 

Sir John Russell, the director of the Rothamsted Station, in 1919 
persuaded R. A. Fisher to tackle the mathematical problems. Fisher 
had studied astronomy at Cambridge, had there become interested in 
least squares and probable errors, and had, while still a student,
published a short paper setting forth the “maximum likelihood” idea
which he has developed further in recent years. Leaving Cambridge 
he went to work for a London investment house as statistician. In 
1915 he published in Biometrika the fundamental equation for testing 
the significance of correlation coefficients, obtaining it by an ingenious 
application of n-dimensional geometry. After coming to Rothamsted, 
Fisher spent several years purely on the mathematics of statistical 
inference, publishing his novel ideas from time to time in various 
journals. It was not until 1924 that his great memoir appeared, On 
the Influence of Rainfall on the Yield of Wheat at Rothamsted, in which 
the effect of each kind of season in conjunction with each kind of 
fertilizer used was definitely measured, and in which, moreover, there 
appeared important new ways of using regression equations, orthogonal 
functions, and multiple correlation coefficients, and of interpreting 
the results in terms of probability. 

Fisher has continued in recent years to make fundamental and im- 
portant contributions to statistical theory, as well as to advise the 
workers in other departments at Rothamsted regarding their problems. 
Recently he published in the Philosophical Transactions of the Royal 
Society the exact distribution in samples of the multiple correlation 
coefficient, a result which will supersede various corrections and
probable error formulas which are now being applied as approximations.
In the same journal he has just shown how to test the significance of a
periodogram when, as is always the case in practice, the most favorable 
period is selected and the standard error is judged from the observed 
data themselves. This solves a long-standing puzzle, and will make it 
possible to discriminate among the many and conflicting results of 
seekers after cycles. In the Proceedings of the London Mathematical 
Society he has just given the solution of another problem on which 
many people have worked with only partial success, that of the 
sampling errors of moments. 

Fisher was joined in 1927 by John Wishart, descendant of a famous 
Scotch family, who had studied with Whittaker at Edinburgh, then 
with Pearson at London, had taught mathematics at the Imperial 
College of Science, and had worked with Spearman on the theory of the 
“tetrad differences” among correlations which are now agitating
psychologists. Wishart succeeded in generalizing Fisher’s paper of 
1915 so as to give joint distributions of variances and covariances 
among several variables. J. O. Irwin has since joined the staff, and 
has done valuable work with problems of sampling. Miss Frances 
Allen of Australia, who has worked with Yule at Cambridge and is now 
at Rothamsted, is a promising young investigator. Students and 
voluntary workers, agricultural college men and Indian Civil Servants, 
are coming to Harpenden in increasing numbers from all parts of the 
world to learn the mathematics of crop experimentation and the theory 
of statistics. 

A. L. Bowley, whose work is well known in this country, is a resident 
of Harpenden. He is professor of statistics in the London School of 
Economics and Political Science, where E. C. Rhodes is also active. 

British universities have on the whole been slow to introduce statis- 
tics as a subject of instruction. The notable exceptions, besides those 
just mentioned, are Edinburgh, where E. T. Whittaker has developed 
an important statistical center, with a laboratory, in the department of 
mathematics, and Cambridge, which has so important a statistician as 
G. Udny Yule as lecturer and fellow of St. John’s College. Yule is 
alone in his work. There is talk at Cambridge of expanding it, but 
what progress will be made is uncertain. Whittaker is most widely 
known, perhaps, on account of the Calculus of Observations which he 
and G. R. Robinson wrote, but his most important accomplishments 
are in pure mathematics. He has contributed extensively to the 
theory of functions, and I heard him give a presidential address in 
November, 1929, before the London Mathematical Society on a subject 
connected with relativity. Many actuarial students are trained in his 
laboratory. Outside of London, Edinburgh and Cambridge, statistics
seems to have found no place as a subject of university instruction,
except as a very minor phase of economics, psychology or astronomy. 

Among British actuarial statisticians several names have become 
familiar in addition to Whittaker’s and Robinson’s. Thus W. F. 
Sheppard, whose corrections for grouping have been used for years, is
still a busy actuary, as are also David Heron and W. Palin Elderton, 
author of Frequency Curves and Correlation, and brother of Ethel M. 
Elderton. In a related field is Dr. Major Greenwood, professor of 
vital statistics in the London School of Hygiene and Tropical Medicine, 
who must be distinguished from his relative, Dr. Arthur Greenwood, 
minister of health in the Labor cabinet. 

Sir Gilbert Walker, of the Imperial College of Science in London, has 
been working for many years over astronomical and meteorological 
statistics, with particular reference to cycles, and has in the process 
developed statistical methods of value in other fields. The same is 
true of David Brunt, author of the well known treatise on the Combi- 
nation of Observations. Chiefly to study the methods of Walker and 
the Pearsons, Dinsmore Alter, of the University of Kansas, is in London 
this year. 

American students of statistics have long speculated as to the identity 
of “Student,” whose papers of 1908 and 1912 in Biometrika inspired
R. A. Fisher, and who was the first to take the fundamental step of 
examining the distribution in random samples of the ratio of mean to 
standard error, thus opening the way for escape from the haze of hypo- 
thetical standard errors and inverse probability which have obfuscated 
the theory of statistics. I have heard guesses in this country
identifying “Student” with Egon S. Pearson and with the Prince of Wales.
He is now so well known in Great Britain that no confidence is violated 
in revealing that he is W. S. Gosset, a research chemist employed by a 
large Dublin brewery. This concern years ago adopted a rule for- 
bidding its chemists to publish their findings. Gosset pleaded that his 
mathematical and philosophical conclusions were of no possible practi- 
cal use to competing brewers, and finally was allowed to publish them, 
but under a pseudonym, to avoid difficulties with the rest of the staff. 
Following his example “Sophister” and “Mathetes,” younger chemists
employed by the same brewery, have published contributions to the 
theory of statistics. This same firm has made large grants to the 
Rothamsted station for research on barley, which incidentally has 
stimulated the work in statistics. 

British books on statistics are few and of high quality. Frequently 
the educated Britisher, like the educated continental European, has gone 
through an elementary course in calculus before reaching the age of
eighteen, and has no use for long-winded discourses which get nowhere
in order to avoid elementary mathematics. Thus even those English 
books on statistics written by economists for economists, such as 
the treatises of Bowley and D. Caradog Jones, and of course such 
works as those of Brunt, Elderton, and Whittaker and Robinson, use 
the calculus, as well as the kind of algebra commonly taught to college 
freshmen in this country but taught in Europe at a much earlier stage. 
British economists and economic statisticians are in general too well 
known in this country to need discussion here, but mention should be 
made of J. R. Stevenson and Arthur Newsholme, who have written on 
population and therefore used statistics, of J. M. Keynes and P. Sargent 
Florence. Apart from Yule and Bowley, British economists have not 
shown any strong tendency to introduce new statistical methods. 
Facilities for publication of important results in statistical theory 
have always been somewhat limited, though Biometrika publishes some 
work in addition to that of the Galton Laboratory, and the Journal of 
the Royal Statistical Society contains many short notes and a few longer 
papers of a theoretical nature. The pressure has of late been somewhat 
relieved by sending papers abroad and by the more hospitable attitude 
of the mathematical and semi-mathematical journals to papers of a 
statistical nature. However, a good deal of important work is greatly
delayed in publication. It is partly due to this fact that it has been so
little understood in other countries. The Scandinavians have indeed
always followed it, but on account of the language difficulty nobody
else could follow the Scandinavians. Of late the English work has
received some attention in France in the treatises of Darmois, Borel,
and others.

There is an active interest in unsolved problems in statistical theory 
in England. For example the problem of generalizing “Student’s”
distribution to non-normal populations was the subject of a lively 
correspondence in the semi-popular journal Nature last autumn. This 
and related problems had been brought to the front in two papers by 
E. S. Pearson and J. S. Neyman, a Pole. The general level of activity
in the development of statistics is high, throughout the country, and 
the results are proving valuable.

THE USE OF COEFFICIENTS OF NET DETERMINATION
IN TESTING THE ECONOMIC VALIDITY OF 
CORRELATION RESULTS 


By RALPH J. WATKINS


Experience in research frequently develops uses for mathematical 
devices not contemplated in the first instance. An example of this is 
to be found in the use of coefficients of net determination in checking, 
from an economic standpoint, the validity of the results yielded by 
multiple correlation analysis. 

The coefficient of total determination¹ is by definition the ratio of
the square of the standard deviation of the estimates to the square of
the standard deviation of the dependent variable, which can be shown
to be equal to the square of the coefficient of correlation (simple or
multiple),² and measures the proportion of the squared variability in
the dependent variable which is attributable to the independent
variable, or to the several independent variables in conjunction, in the
case of multiple correlation. The significant fact in this connection
is that the coefficient of total determination can be broken down into
coefficients of net determination, each of which shows the proportion
of the squared variability in the dependent variable attributable to
each independent variable and all of which sum to the coefficient of
total determination.³ In short, it becomes possible to say how much
each of the independents contributed to the degree of quantitative
relationship established in the correlation study. This, of course, is
of the utmost usefulness in evaluating the forces which we are seeking
to understand.⁴

¹ The coefficient of determination was developed by Sewall Wright and certain simplifications have
been devised by Bradford B. Smith. The formulas and proofs given here follow Smith's exposition.
See the following references:

Sewall Wright, “Correlation and Causation,” Journal of Agricultural Research, 20: 557-585, 1921.

B. B. Smith, “Forecasting the Acreage of Cotton,” this JOURNAL, 20: 42, 1925.

B. B. Smith, Correlation Theory and Method Applied to Agricultural Research, U. S. Department of
Agriculture, Washington, August, 1926, pp. 13-14, 55-57.

² This proof is that given by Mr. Bradford B. Smith in Correlation Theory and Method Applied to
Agricultural Research, pp. 13-14.

For each residual (d)

$$d = y - y_e,$$

in which $y$ is the dependent variable and $y_e$ is the corresponding estimate.
Squaring, summing, and dividing by N we secure

$$\frac{\Sigma d^2}{N} = \frac{\Sigma y^2}{N} - \frac{2\,\Sigma y y_e}{N} + \frac{\Sigma y_e^2}{N}. \tag{1}$$

But $\Sigma y y_e$ may be shown to equal $\Sigma y_e^2$, as follows:

If b is the slope of the regression line and x is the independent variable,

$$y_e = bx, \quad \text{where } b = \frac{\Sigma xy}{\Sigma x^2}, \text{ and hence } \Sigma xy = b\,\Sigma x^2.$$

$$y y_e = y b x.$$

Summing, $\Sigma y y_e = \Sigma y b x = b\,\Sigma xy = b^2\,\Sigma x^2$.
Also, $y_e^2 = b^2 x^2$.
Summing, $\Sigma y_e^2 = b^2\,\Sigma x^2$.
Therefore, $\Sigma y_e^2 = \Sigma y y_e$.
Substituting $\Sigma y_e^2$ for $\Sigma y y_e$ in (1) we have

$$\frac{\Sigma d^2}{N} = \frac{\Sigma y^2}{N} - \frac{2\,\Sigma y_e^2}{N} + \frac{\Sigma y_e^2}{N} = \frac{\Sigma y^2}{N} - \frac{\Sigma y_e^2}{N}. \tag{2}$$

But $\Sigma d^2 / N$ is merely the square of the standard error of estimate ($S_y^2$); $\Sigma y^2 / N$ is the square of the standard
deviation of the dependent variable ($\sigma_y^2$); and $\Sigma y_e^2 / N$ is the square of the standard deviation of the estimates
($\sigma_{y_e}^2$).

Rewriting (2),

$$S_y^2 = \sigma_y^2 - \sigma_{y_e}^2 \quad \text{or} \quad \sigma_{y_e}^2 = \sigma_y^2 - S_y^2. \tag{3}$$

The coefficient of total determination is by definition $\sigma_{y_e}^2 / \sigma_y^2$.
Substituting for $\sigma_{y_e}^2$ the right-hand side of (3) we have

$$\frac{\sigma_y^2 - S_y^2}{\sigma_y^2} \quad \text{or} \quad 1 - \frac{S_y^2}{\sigma_y^2},$$

which will be recognized as the familiar form for the square of the coefficient of correlation.

³ Since the coefficient of total determination is equal to the square of the coefficient of multiple
correlation,

$$\frac{\sigma_{1e}^2}{\sigma_1^2} = R^2 \quad \text{(rewriting the coefficient of total determination in symbols of multiple correlation)}$$

and since the coefficient of multiple correlation may be written as

$$R^2_{1.234\ldots} = \frac{b_{12}\,p_{12} + b_{13}\,p_{13} + b_{14}\,p_{14} + \cdots}{\sigma_1^2} \quad \text{(secondary subscripts being omitted from the right-hand side for simplicity),}$$

in which $b_{12}$ is the net regression coefficient of the dependent on the first independent and $p_{12}$ is the mean
product of the dependent and the first independent, and similarly for the other coefficients and mean
products, it becomes possible to apportion the squared variability of the dependent among the several
independents. Thus,

$$\frac{\sigma_{1e}^2}{\sigma_1^2} = \frac{b_{12}\,p_{12}}{\sigma_1^2} + \frac{b_{13}\,p_{13}}{\sigma_1^2} + \frac{b_{14}\,p_{14}}{\sigma_1^2} + \cdots$$

Each of the terms on the right-hand side of the equation represents a portion of the squared variability
in the dependent variable attributable to the appropriate independent variable.

⁴ There are certain difficulties of interpretation, particularly when a coefficient of net determination
appears with negative sign. Cf. Smith: Correlation Theory and Method Applied to Agricultural Research,
op. cit., p. 57.

It will be recognized at once that the coefficient of determination
is a mathematical device and that the interpretation of the product
secured from this device must hinge on the nature of the mathematical
operations involved. Much as one would like to develop a
mathematical measure for economic research which could be used without
qualification, one’s sense of reality in this skeptical age will not permit
him to deduce economic principles from statistical coefficients without
due regard to their mathematical make-up. There is no prospect for
the development of a mathematical filter which will automatically
select, classify, and measure with precision the economic significance of
numeric representations of economic phenomena.
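
The apportionment described above can be sketched in a few lines of Python for the case of two independent variables. This is an illustration only: the three series are hypothetical numbers invented for the purpose, the function names are not from the article, and the net regression coefficients are obtained by solving the two normal equations directly.

```python
def mean(values):
    return sum(values) / len(values)

def mean_product(u, v):
    """Mean product of deviations from the means (divisor N)."""
    u_bar, v_bar = mean(u), mean(v)
    return sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v)) / len(u)

def net_determination(y, x2, x3):
    """Return ((net coefficient for x2, net coefficient for x3), R squared)
    for the least-squares regression of y on x2 and x3."""
    p22, p33, p23 = mean_product(x2, x2), mean_product(x3, x3), mean_product(x2, x3)
    p12, p13 = mean_product(y, x2), mean_product(y, x3)
    det = p22 * p33 - p23 ** 2
    b12 = (p12 * p33 - p13 * p23) / det  # net regression coefficients
    b13 = (p13 * p22 - p12 * p23) / det
    var_y = mean_product(y, y)

    # R squared computed independently, from the residuals of the estimates.
    y_bar, x2_bar, x3_bar = mean(y), mean(x2), mean(x3)
    estimates = [y_bar + b12 * (a - x2_bar) + b13 * (b - x3_bar)
                 for a, b in zip(x2, x3)]
    mean_sq_residual = sum((o - e) ** 2 for o, e in zip(y, estimates)) / len(y)
    r_squared = 1.0 - mean_sq_residual / var_y

    return (b12 * p12 / var_y, b13 * p13 / var_y), r_squared

# Hypothetical series, invented for illustration only.
y = [12.0, 15.5, 11.0, 18.2, 20.1, 17.4, 22.8, 25.0]
x2 = [1.0, 2.0, 1.5, 3.0, 3.5, 2.5, 4.0, 4.5]
x3 = [9.0, 8.0, 9.5, 7.0, 7.5, 6.0, 5.5, 5.0]

nets, r_squared = net_determination(y, x2, x3)
print("net determination:", [round(v, 4) for v in nets])
print("sum of nets:", round(sum(nets), 4), "R squared:", round(r_squared, 4))
```

The two coefficients of net determination add up to the coefficient of total determination whatever the data, which is the property the argument relies on. Nothing in the arithmetic prevents one of them from being negative.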

These limitations will be clearer on the basis of a concrete problem. 
For convenience, certain examples will be taken from a study of the 
cotton market by Mr. Bradford B. Smith.¹ One of the preliminary
analyses has to do with determining the relationship between cotton 
supply and cotton price. The average December spot price of middling 
cotton in cents per pound at New Orleans was taken as the dependent 
variable and the two independent variables were cotton supply and the 
general price level. The general price level was introduced in order 
to make due allowance for changes in the value of the pecuniary unit, 
since these changes would, of course, obscure the supply-price relation- 
ship. Something will be said later concerning this substitute for the 
usual method of deflating value series. Cotton supply was represented 
as the sum of the carry-over at the beginning of the cotton year and the 
production for that year in millions of bales, while the United States 
Bureau of Labor Statistics Index of All Commodity Prices at wholesale 
(1913 = 100) for December was used to measure the price level. Since 
it was believed that the relationships were proportional rather than 
absolute, the logarithms of the three series were correlated. The period 
covered in Mr. Smith’s study was from 1905 to 1924. The data for the 
three succeeding years were added for the purpose of this article, but 
this makes no essential difference. A multiple correlation coefficient 
of .9458 is secured and the coefficient of total determination is .8946. 
It thus appears that 89.46 per cent of the squared variation in the
dependent variable is accounted for by the variations in these two 
independents. The coefficients of net determination are: price level, 
.8233; cotton supply, .0713. In short, it appears that changes in 
price level are over eleven times as important as changes in cotton 
supply in accounting for variations in cotton price! 
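
The figures just quoted can be checked against one another with a short Python sketch. The variable names are illustrative; the numbers are those given in the text.

```python
# Figures quoted in the text for the cotton-price problem.
MULTIPLE_R = 0.9458            # coefficient of multiple correlation
TOTAL_DETERMINATION = 0.8946   # coefficient of total determination
NET = {"price level": 0.8233, "cotton supply": 0.0713}

total_from_r = MULTIPLE_R ** 2       # square of the correlation coefficient
total_from_nets = sum(NET.values())  # the net coefficients should sum to the total
importance_ratio = NET["price level"] / NET["cotton supply"]

print(round(total_from_r, 4))
print(round(total_from_nets, 4))
print(round(importance_ratio, 2))
```

The square of the multiple correlation coefficient and the sum of the two net coefficients both come to about .8946, and the ratio of the two net coefficients is a little over eleven and a half.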

We are immediately led into an absurd position if we attempt to 
apply this relationship to the economics of the cotton market at a 
particular time. The significant point is that this absurdity does not 
appear from the multiple correlation coefficient, the regression equa- 
tions, or the standard error of estimate, or even from the coefficient of 
total determination. Of our statistical measures, only the coefficients
of net determination reveal the absurdity. There can be no quarrel
with the mathematics, assuming the arithmetic accuracy of the
calculations. It is literally true that the series representing the price
level is over eleven times as important as the series representing cotton
supply, in accounting for the December price of cotton on the basis of
the factors considered in this study over this 23-year period. The
reason for this will be considered presently. The point to be stressed
here is that the calculation of the coefficients of net determination in
such a case would save the investigator from a patent absurdity.
Having accounted for 89.46 per cent of the variation in the dependent,
as shown by the coefficient of total determination, the investigator
is apt to be so encouraged that no further critical examination will
be made.

¹ Factors Affecting the Price of Cotton, Technical Bulletin, No. 50, U. S. Department of Agriculture,
Washington, January, 1928.

A second example from the same study will be cited. This involves 
the problem of forecasting cotton acreage on the basis of the price of 
cotton relative to other farm products at the beginning of the season 
and the percentage change which occurred in cotton acreage during the 
preceding year. The annual cotton acreage harvested from 1902 to 
1926 constituted the dependent variable, and the independent variables 
were as follows:¹


(1) The New York average price of cotton for delivery in March as quoted 
during December of the calendar year of harvest divided by the Bureau of Labor 
Statistics Index of Farm Product Prices at Wholesale for the same December. 

(2) The same as (1), except that it is taken for one year earlier. 

(3) The percentage change that took place in acreage during the year preceding 
the given year of harvest. 

(4) Trend—taken as the last two digits of the year of harvest. 


For the purposes of this article, the data for 1927 were added, the 
second independent variable as listed above was dropped on the 
grounds that it was of little significance, and the third listed independ- 
ent variable was expressed in terms of link relatives rather than as 
positive and negative percentage deviations. Finally, linear multiple 
correlation methods were used while Mr. Smith employed curvilinear 
regression lines. 

None of these changes makes any great difference in so far as the 
present purpose is concerned. The coefficient of multiple correlation 
yielded by this four variable problem is .8249 and the coefficient of 
total determination is .6805. The coefficients of net determination 
are as follows: relative price of cotton, .1050; acreage change in pre- 
ceding year, .0142; time, .5613. Here again we secure astounding 
results. These measures indicate that the passage of time is over five 
1 Factors Affecting the Price of Cotton, p. 21. 
times as important as the pre-season price of cotton in accounting for
the acreage of cotton, and over 39 times as important as the acreage 
change in the preceding year. 
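
The arithmetic behind these comparisons can be checked directly. The short sketch below uses only the three coefficients of net determination quoted above; the variable names are illustrative. It confirms that the coefficients sum to the coefficient of total determination (.6805), that the square root of that sum is the coefficient of multiple correlation (.8249), and that the ratios are a little over five and a little over 39.

```python
import math

# Coefficients of net determination as reported in the text.
net = {
    "relative price of cotton": 0.1050,
    "acreage change in preceding year": 0.0142,
    "time": 0.5613,
}

# Their sum is the coefficient of total determination.
total_determination = sum(net.values())
multiple_correlation = math.sqrt(total_determination)

# Ratios of "importance" obtained by comparing the coefficients directly.
time_vs_price = net["time"] / net["relative price of cotton"]
time_vs_acreage = net["time"] / net["acreage change in preceding year"]

print(f"total determination  = {total_determination:.4f}")
print(f"multiple correlation = {multiple_correlation:.4f}")
print(f"time vs. price       = {time_vs_price:.2f}")
print(f"time vs. acreage     = {time_vs_acreage:.2f}")
```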

Common observation and statistical measurement show us that there 
is a fairly close relation between changes in cotton price and changes 
in acreage, that “the acreage in cotton has usually been reduced when 
prices in January were lower than in the preceding January and increased
when prices were higher.”¹ Yet our analysis shows that price
is relatively unimportant when compared with the passage of time. 

We may next consider why it is that these mathematically correct 
measures have yielded results which are clearly unsound from an 
economic standpoint, if we seek to apply them to the current situation.*
The answer is simple: the variables were such that the absolute magni- 
tudes of the less important factors obscured the important year-to-year 
fluctuations in the significant factors. In the first example, as would 
be expected, the trend of the price of cotton is substantially the same 
as that of the general price level. Both series rise from low points in 
1902 to high levels in 1920 and both decline to low points in 1921 and 
again rise to fairly high levels thereafter. 

It would be extraordinary if this were not true. The significant 
relationship between the year-to-year variations in cotton price and 
the year-to-year variations in supply is obscured by the close correla- 
tion between the trends of cotton price and the general price level. 
The absolute levels of these last two named series are many times larger 
than the previously mentioned annual variations. It is substantially 
this relationship which is indicated by the coefficients of net determina- 
tion. The same sort of explanation covers the second example. The 
acreage figures and the series representing time have approximately 
the same trends, both rising from low levels at the beginning of the 
period to high levels at the close. The fluctuations of significance 
from an economic standpoint are the year-to-year variations in the 
pre-season price of cotton and the acreage harvested. It is apparent 
from an examination of the data that there is a direct and fairly close 
relationship between the changes in these series. The fluctuations in 
price, while large, are mathematically overshadowed by the correlation 
between the absolute levels of the series representing acreage and time. 
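
The effect described here can be reproduced on made-up figures. The sketch below is an illustration only: the data are hypothetical, and it assumes that a coefficient of net determination is the product of a variable's standardized regression coefficient and its simple correlation with the dependent variable (a common definition, under which the coefficients sum to the coefficient of total determination, but one not spelled out in this excerpt). The dependent series is built from a steady trend plus a fluctuating factor that drives nearly all of its year-to-year change.

```python
import math

def mean(v):
    return sum(v) / len(v)

def corr(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def net_determination(y, x1, x2):
    """Return (d1, d2) with d_i = beta_i * r_yi for a linear fit on two regressors.

    The betas are the standardized regression coefficients, obtained in
    closed form from the three simple correlations (assumed definition).
    """
    r1, r2, r12 = corr(y, x1), corr(y, x2), corr(x1, x2)
    beta1 = (r1 - r2 * r12) / (1 - r12 ** 2)
    beta2 = (r2 - r1 * r12) / (1 - r12 ** 2)
    return beta1 * r1, beta2 * r2

def diffs(v):
    """Year-to-year changes of a series."""
    return [b - a for a, b in zip(v, v[1:])]

# Hypothetical data: 23 "years", a steady trend plus a fluctuating factor.
years = list(range(23))
pattern = [1, -2, 3, -1, 2, -3]
factor = [pattern[i % 6] for i in years]
y = [2 * t + 3 * f + 0.5 * math.sin(1.7 * t) for t, f in zip(years, factor)]

d_time, d_factor = net_determination(y, years, factor)
r_changes = corr(diffs(y), diffs(factor))

print(f"net determination, time:   {d_time:.3f}")
print(f"net determination, factor: {d_factor:.3f}")
print(f"correlation of year-to-year changes in y and factor: {r_changes:.3f}")
```

On these figures the trend receives by far the larger coefficient, although the year-to-year changes in the dependent series follow the fluctuating factor almost exactly. Taking first differences, or removing the trend beforehand, brings that relationship back into view.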

These two examples furnish evidence of the danger of injecting time 
or the price level as an independent variable instead of the usual method 
of eliminating the trend factor or deflating value series. Instead of 


1 Factors Affecting the Price of Cotton, p. 8. 

* This is not a criticism of Mr. Smith’s publication; the examples and correlations cited in this paper 
form a relatively small portion of his complete study and were not used by Mr. Smith in interpreting the 
current market situation. 
adjusting for trend or the changes in the general price level by this
method as suggested by Mr. Smith,¹ we are more apt to obscure the
basic economic relationship in which we are interested by a correlation 
of secular trends. 

The cases cited show that such errors may be avoided by computa- 
tion of the coefficients of net determination. If it is countered that this 
subjective method of checking the economic validity of correlation 
results constitutes an abandonment of objectivity, the proper reply is 
that a certain degree of competence in economic logic and market sense 
must be assumed on the part of the investigator. There is no virtue 
in a statistical measure in economic analysis unless its mathematical 
nature is understood and its use is tempered with some knowledge of 
the economic forces represented in the analysis. No science can dis- 
pense with intelligence in the interpretation of its results and it is not 
likely that quantitative economics can be divorced from reason. 

Another important point in the consideration of the coefficient of 
total determination and the coefficients of net determination is that 
these measures are valid only with respect to the variables which have 
been included in the study. For example, we may have included five 
independent variables in an attempt to understand the fluctuations of 
cotton prices and our coefficient of total determination may be 95 per 
cent or better and it may still be true that we have not included the five 
most significant factors. Economic coincidences or high correlation 
between the absent significant factors and those which were included 
may account for the high coefficient. Similarly, a comparison of the 
coefficients of net determination as showing the relative significance of 
the several independents in accounting for the variations in cotton price 
is valid only on the basis of the factors considered. The inclusion of 
another factor may show it to be more significant than any of those 
factors previously measured. 

It scarcely needs stating that in using coefficients of net determi- 
nation, we merely assign relative degrees of responsibility to the several 
series of numeric representations of economic phenomena (the inde- 
pendent variables) which we have used in attempting to account for a 
single series of numeric representations of economic phenomena (the 
dependent variable). Whether the economic phenomena themselves 
are causally related in the degrees indicated by the coefficients or in any 
other degree is quite another problem on which our coefficients have 
shed no light whatsoever. 

From what has been said, it will be clear that high coefficients of 


1 “Error in Eliminating Secular Trend and Seasonal Variation Before Correlating Time Series,” this
JOURNAL, 20: 543-545, December, 1925.

multiple correlation between sets of raw data should generally be discounted
very heavily and accepted only after the most searching
scrutiny. The coefficients of net determination afford, it seems to me, 
the simplest method of instituting this analysis. Their calculation is 
an extremely easy task, since all the values required will have been 
computed in determining the usual correlation measures. A few 
minutes of labor and a little judgment will frequently enable one to 
avoid implied conclusions in quantitative economics which no one 
would support when stated directly. 
STATISTICAL ASSOCIATION ¹
By Stuart A. Rice and Morris Green, University of Pennsylvania


In the September, 1929, number of this JOURNAL the present writers
presented data showing the extent to which the 1928 membership of 
four social science societies overlapped. At the suggestion of Pro- 
fessor E. B. Wilson we have made a further analysis of the membership 
of the American Statistical Association. Our object has been to obtain 
the best estimate possible of the extent to which the membership of the 
Statistical Association is drawn from each of a variety of fields of
interest which employ statistical methods. We conceive this question
to be of importance for the Association because of the numerous evidences
that it has largely become an economic and business statistical
association, rather than an organization which gathers together the
personnel and interests of all those who use statistics.

It has been impossible to obtain more than an approximate distribu- 
tion of the membership by fields of interest. There is no sure index to 
the latter. Moreover our data are not as complete as we would have
wished, especially because of the absence of comparisons with member- 
ship lists of organizations directly representing the fields of psychology 
and education. In the following tables we have omitted all institu- 
tional (non-individual) memberships and all those from foreign lands 
other than Canada and Mexico. 

We have proceeded, first, by comparing the 1928 membership list of 
the American Statistical Association (in its “Handbook”) with lists of
a number of other organizations as of the same year. Part of our data
were taken from the note in the September, 1929, JOURNAL previously
mentioned. In the case of a number of the fields of interest considered, 
we have depended upon section membership in the American Associa- 
tion for the Advancement of Science, taking account only of the first 
section checked by the member as indicating his major interest. Among 
1,787 members of the Statistical Association we were able to identify 

1 We have made no specific mention in this note of W. I. King’s “Classification of Members of American
Statistical Association on Basis of Duties and Interests,” this JOURNAL, Vol. XXII, No. 158, June,
1927, pp. 224-6. Dr. King’s results were procured by means of a questionnaire and provide interesting
comparison, but not statistical comparability, with our results herewith.

1,000 additional affiliations with other organizations, maintained by
727 members. This distribution appears in Tables I and II. 

In the case of 1,060 members, who are without other affiliations 
among the organizations to which we have given attention, we next 
sought indications of interest or affiliation in the mailing address given 
in the Handbook. From this source we were able to classify the field of 
interest for an additional 510 members. This distribution appears in 
Table III. It is our opinion that the 550 members of the Association 
remaining undistributed by either method of analysis would fall into 
the “economic and business” category in larger proportions than do 
the 510 members distributed in Table III. This belief is based in part 
upon the fact that a very high proportion of the 550 unidentified mem- 
bers give addresses in what appear to be commercial office buildings. 
We reason further that members connected with university or govern- 
ment departments were more easily identified, and hence more likely 
to have been distributed in the first instance, than those connected 
with business firms. 
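
The membership counts quoted in the preceding paragraphs can be reconciled with a few subtractions. The sketch below uses only figures stated in the text; the variable names are illustrative.

```python
total_members = 1787            # individual members considered
with_other_affiliations = 727   # identified through other membership lists
classified_by_address = 510     # classified from the Handbook mailing address

# Members with no other affiliation among the organizations examined.
without_other_affiliations = total_members - with_other_affiliations

# Members left undistributed by either method of analysis.
unidentified = without_other_affiliations - classified_by_address

# Members classified by one method or the other (the base of Table IV).
classified_total = with_other_affiliations + classified_by_address

print(without_other_affiliations, unidentified, classified_total)
```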

In Table IV we have attempted to assemble our data in such a way 
as to show the comparative extent of collateral interest among 1,237
of the 1,787 individual members of the Association. The consolidations
effected in this table are in most instances dependent upon personal
judgment. For instance in the category “political science” we
have included all of the 67 members included in Table III as connected 
with the public service in any capacity. Again we have included in the 
category “economics and business” sub-categories of classes in Table
III (not shown therein) relating to agriculture, because of the fact, 
which became apparent in the process of tabulation, that the agri- 
cultural interest within the membership was largely that of agricultural 
economics. The outstanding conclusion that we are able to draw from 
Table IV is a confirmation of the overwhelming preponderance of the 
economic and business interest in the Association. The category 
which we have labeled “sociology and social welfare” is much larger
than we had anticipated from the earlier analysis, and appears to rate
second in importance. It is, nevertheless, exceeded by the economic
and business interest in a ratio of nearly six to one. Moreover the
category “sociology and social welfare” is perhaps in reality more
widely diversified than are the other categories. Another field of 
interest which is proportionately larger than our earlier data tended to 
suggest, is that of actuarial science. The total overlapping of member- 
ship with the American Association for the Advancement of Science is 
also unexpectedly large. However, it must be again recalled that this 
latter organization itself contains the widest variety of interests. 
Memberships of the following organizations and sections were
used in the study:

1. American Sociological Society
2. American Statistical Association
3. American Economic Association
4. American Political Science Association
5. American Mathematical Society
6. Mathematical Association of America
7. American Institute of Actuaries
8. Actuarial Society of America
9. American Association for the Advancement of Science:
   (a) Mathematics
   (b) Physics
   (c) Chemistry
   (d) Astronomy
   (e) Geology and Geography
   (f) Zoological Sciences
   (g) Botanical Sciences
   (h) Anthropology
   (i) Psychology
   (k) Social and Economic Sciences
   (l) History
   (m) Engineering
   (n) Medical
   (o) Agriculture
   (q) Education
   (x) No interest specified


TABLE I

AFFILIATIONS WITH SPECIFIED ORGANIZATIONS OF 727 INDIVIDUAL
MEMBERS OF THE AMERICAN STATISTICAL ASSOCIATION
(Inclusive of Duplications)

[The figures of this table are not legible in the source. Its rows follow the organizations and sections listed above.]

The possible number of combinations among these organizations and
sections is, of course, extremely large. However, we discovered only
78 actual combinations.¹

In Table I we have consolidated memberships in collateral organiza- 
tions and sections without respect to plurality of collateral affiliation; 
that is, duplications are included. 


TABLE II

MEMBERS OF THE AMERICAN STATISTICAL ASSOCIATION BELONGING TO ONLY
ONE OTHER ORGANIZATION, OR GROUP OF CLOSELY RELATED ORGANIZATIONS,
AMONG THOSE EXAMINED

                                                                  Percentage of the
Organization                                            Number    overlapping membership
1. American Sociological Society                           29           39
3. American Economic Association                          401           80
4. American Political Science Association                   4           13
5. American Mathematical Society                            8           10
6. Mathematical Association of America                      5           10
7. American Institute of Actuaries                         12           29
8. Actuarial Society of America                             8           18
9. American Association for the Advancement of Science:
     [section label illegible]                              3            †
     [section label illegible]                              2            †
     [section label illegible]                              5            †
     [section label illegible]                              1            †
     [section label illegible]                             19            †
     (k) Social and Economic Sciences                      15           32
     [section label illegible]                              7            †
     [section label illegible]                              9       [illegible]
     [section label illegible]                              1            †
     [section label illegible]                             14            †
     (x) No interest specified                             79           44
5, 6, 9a, alone or in combination (mathematics)            48           51*
[label illegible]                                          34           54‡

* On a base of 95.   † Not calculated.   ‡ On a base of 63.


In Table II is shown the extent of dual membership in the American 
Statistical Association and in other organizations in which, in each case, 
no third affiliation was discovered. This seems to have significance 
because it exhibits what might be called a singularity of interest in 


TABLE III

VOCATIONAL AFFILIATIONS OF MEMBERS OF THE AMERICAN STATISTICAL ASSOCIATION,
NOT AFFILIATED WITH OTHER PROFESSIONAL OR SCIENTIFIC
ORGANIZATIONS SO FAR AS DISCOVERED

[label illegible]                        550
[label illegible]                        295
[label illegible]                        107
[label illegible]                         67
Social and Community Agencies             35
[label illegible]                          6
Total                                  1,060

[Row labels are largely illegible in the source. According to the text, 550 is the number of members left unidentified and 67 the number connected with the public service.]











1 Space does not permit presenting these in detail here, but we will be glad to supply the data con- 
cerning any particular combination to any interested reader. 
certain fields, particularly economics, as compared with a diversity of
interest in certain other directions. Thus if Table II be compared with 
Table I, it is found that of 500 members of the Statistical Association 
also members of the American Economic Association, 401, or 80 per 
cent, belong to these two organizations exclusively. Whereas, of 30 
members belonging to the American Statistical Association and the 
American Political Science Association, only 4, or 13 per cent, belong 
to these two organizations exclusively. 
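
The two percentages in this comparison follow from the counts just quoted. A minimal sketch, with an illustrative function name:

```python
def exclusive_share(exclusive, overlapping):
    """Percentage of an overlapping membership that holds no third affiliation."""
    return 100 * exclusive / overlapping

# Counts quoted in the text: exclusive (Table II) over overlapping (Table I).
economic = exclusive_share(401, 500)
political_science = exclusive_share(4, 30)

print(f"American Economic Association:          {economic:.0f} per cent")
print(f"American Political Science Association: {political_science:.0f} per cent")
```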

Table IV, as hitherto noted, represents an attempt to consolidate 
the preceding data for 1,237 among 1,787 individual members. Dupli- 
cations have been eliminated within each category but considerable 
overlapping exists among categories. The question may be asked 
whether the remaining 550 unidentified members may be assumed to be 
distributed in general accordance with the percentages shown in the last 
column of Table IV. We believe that this cannot be assumed; espe- 
cially since we feel, as already pointed out, that those whose interests 
are in business among the 550 unidentified members are proportionately 
more numerous than is indicated by these percentages. 


TABLE IV

INDICATED FIELDS OF INTEREST OF 1,237 AMONG 1,787 INDIVIDUAL AMERICAN
MEMBERS OF THE AMERICAN STATISTICAL ASSOCIATION *

                                                    Number of    Percentage of
Field of interest                                   members      1,237 members
[label illegible]                                      923           74.6
[label illegible]                                      158           12.8
[label illegible]                                       98            7.9
[label illegible]                                       95            7.7
[label illegible]                                      124           10.0
[label illegible]                                       10            0.8
Biological Science, including Agriculture               41            3.3
[label illegible]                                       10            0.8
[label illegible]                                       45            3.6
[label illegible]                                       13            1.1
[label illegible]                                       20            1.6
[label illegible]                                       37            3.0
American Association for the Advancement of
  Science, all sections                                178           14.4

* Duplications exist among the categories enumerated, but duplications within each category have
been eliminated. Inclusion in or exclusion from given categories of necessity has been arbitrary.
NOTE ON INTERPOLATION


By Jan K. WISNIEWSKI 


Statement of the problem. Given the arithmetic averages
$y_1, y_2, \ldots, y_n$ of some variable $y$ in successive intervals
$(x_0, x_1), (x_1, x_2), \ldots, (x_{n-1}, x_n)$, to devise an interpolation
formula which will yield a value of $y$ for any value of $x$ within the
limits $x_0$ to $x_n$.


Usual method. We center the values of $y_1, y_2, \ldots, y_n$ at the middles
of the corresponding intervals and then draw through the resulting
points, $P_1, P_2, \ldots, P_n$, a curve of a given equation (not shown in
Chart I) involving $n$ parameters. There are $n$ equations of the following
form

$$
y_1 = f\!\left(\frac{x_0 + x_1}{2}\right), \quad \ldots, \quad
y_n = f\!\left(\frac{x_{n-1} + x_n}{2}\right) \tag{1}
$$

Now, when we have determined the parameters of the curve, we may
observe that the arithmetic average of the values of $f(x)$ in any interval
$(x_{i-1}, x_i)$ is not necessarily equal to $y_i$, which is, presumably, an
undesirable result. It can be easily proved that the precise equality
takes place only in the case of $f(x)$ being a linear function.

Proposed method. Instead of equations (1) we write $n$ equations of
the following form

$$
\begin{aligned}
(x_1 - x_0)\,y_1 &= \int_{x_0}^{x_1} f(x)\,dx \\
(x_2 - x_1)\,y_2 &= \int_{x_1}^{x_2} f(x)\,dx \\
&\;\;\vdots \\
(x_n - x_{n-1})\,y_n &= \int_{x_{n-1}}^{x_n} f(x)\,dx
\end{aligned}
$$

Geometrically, it is equivalent to drawing a curve satisfying the
condition that, for each interval, the areas $X_{i-1}A_{i-1}B_iX_i$ and
$X_{i-1}K_{i-1}K_iX_i$ shall be equal. (Chart I).


CHART I 
Special case. Assume that the intervals are only three in number
and that the curve used for interpolation is a quadratic parabola
$y = ax^2 + bx + c$; assume further that the intervals are equal in length,
30 units each, the origin being taken at the middle of the second interval.
So the conditions are assimilated to those of interpolation in a
three-months’ period when we do not take account of the varying
length of months.

$$
\begin{aligned}
(-15 + 45)\,y_1 &= \int_{-45}^{-15} (ax^2 + bx + c)\,dx
 = \left[\tfrac{1}{3}ax^3 + \tfrac{1}{2}bx^2 + cx\right]_{-45}^{-15} \\
 &= \tfrac{1}{3}a(-3375 + 91125) + \tfrac{1}{2}b(225 - 2025) + c(-15 + 45); \\
30\,y_1 &= 29250a - 900b + 30c
\end{aligned}
$$

$$
y_1 = 975a - 30b + c; \tag{3}
$$

similarly

$$
y_2 = 75a + c \quad \text{and} \quad y_3 = 975a + 30b + c
$$

therefore

$$
a = \frac{1}{1800}(y_1 - 2y_2 + y_3); \qquad
b = \frac{1}{60}(y_3 - y_1); \qquad
c = y_2 - \frac{1}{24}(y_1 - 2y_2 + y_3).
$$


If we had evaluated the parameters by the usual method (equations
1), the values of $a$ and $b$ would be the same but $c$ would be equal to $y_2$;
thus the application of the new method is equivalent in this case to
shifting the curve upwards or downwards by a constant amount; but 
this holds true only under the qualification that the curve is a parabola 
of second order and the intervals are of equal length. 

Numerical example. Let $y_1, y_2, y_3$ be the mean temperatures of
June, July and August in Warsaw, Poland, equal to 16.95, 18.40 and
17.54 degrees centigrade, respectively. The values of the parameters
are: $a = -.0013$; $b = .01$; $c$ (old method) $= 18.40$; $c$ (new method) $= 18.50$.
In Chart II the new method is represented by the solid line. 
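
The numerical example can be re-computed with a short sketch. The function names and the default interval length `h = 30` are illustrative choices; the formulas are those of the special case above, written for a general common interval length. The sketch fits the parabola by both methods and checks that only the new method returns the three given monthly averages as the means of the curve over the three intervals.

```python
def fit_midpoint(y1, y2, y3, h=30.0):
    """Usual method: parabola through the interval midpoints -h, 0, +h."""
    a = (y1 - 2 * y2 + y3) / (2 * h ** 2)
    b = (y3 - y1) / (2 * h)
    c = y2
    return a, b, c

def fit_mean_preserving(y1, y2, y3, h=30.0):
    """Proposed method: the mean of the parabola over each interval equals y_i."""
    a, b, _ = fit_midpoint(y1, y2, y3, h)
    c = y2 - a * h ** 2 / 12   # for h = 30 this is y2 - 75a
    return a, b, c

def interval_mean(a, b, c, lo, hi):
    """Exact mean of a*x**2 + b*x + c over the interval [lo, hi]."""
    def antiderivative(x):
        return a * x ** 3 / 3 + b * x ** 2 / 2 + c * x
    return (antiderivative(hi) - antiderivative(lo)) / (hi - lo)

# Mean temperatures of June, July and August in Warsaw, from the text.
june, july, august = 16.95, 18.40, 17.54
intervals = [(-45, -15), (-15, 15), (15, 45)]

old = fit_midpoint(june, july, august)
new = fit_mean_preserving(june, july, august)

old_means = [interval_mean(*old, lo, hi) for lo, hi in intervals]
new_means = [interval_mean(*new, lo, hi) for lo, hi in intervals]

print("a, b, c (old method):", old)
print("a, b, c (new method):", new)
print("interval means, old method:", old_means)
print("interval means, new method:", new_means)
```

With the old method the mean of the curve over the middle interval falls short of the July average of 18.40 by 75 times the absolute value of `a`, roughly a tenth of a degree; this is the constant shift that the new method removes.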


CHART II 
EMPLOYMENT OF MEXICANS IN CHICAGO AND THE CALUMET
REGION 


One of the significant aspects of the Mexican migration to the United States 
is the appearance of these immigrants in industrial employment in the North. 


TABLE I

NUMBER AND PERCENTAGE OF MEXICANS EMPLOYED IN MAINTENANCE OF WAY
WORK ON 16 RAILROADS IN THE CHICAGO-GARY REGION, 1916-1928

Year           Mexican       Total        Per cent
               employees     employees    Mexican
[1916-1922]    [figures illegible in the source]
[1923]           2,181         9,978        21.9
[1924]           2,978         9,516        31.3
[1925]           3,710        12,404        29.9
[1926]           5,255        12,987        40.5
[1927]           4,284        10,244        41.8
[1928]           3,963         9,238        42.9

















Table I shows historically the employment of Mexicans in Chicago and immediate
vicinity for maintenance of way work on 16 railroads. In most cases the
“Chicago division” was the unit of line taken. From a single payroll each year
(in practically all cases the second half of June) the number of Mexicans was 
obtained by identification of names. In one case, estimates only for years pre- 
ceding 1928 could be obtained. It is believed that this table includes all rail- 
roads within the area employing down to 100 Mexicans in maintenance of way 
work. A sprinkling of Mexicans employed by railroads as freighthandlers, etc.,
are not included. The percentage of Mexicans employed on these railroads in 
1928 ranged from 4 per cent to 80 per cent. 
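
The percentage column of Table I can be re-derived from the two count columns. The sketch below takes only the six rows of Table I in which both counts are legible, in the order in which they appear; the variable names are illustrative.

```python
# (Mexican employees, total employees) for the six fully legible rows of Table I.
legible_rows = [
    (2181, 9978),
    (2978, 9516),
    (3710, 12404),
    (5255, 12987),
    (4284, 10244),
    (3963, 9238),
]

# Per cent Mexican, rounded to one decimal place as in the table.
percentages = [round(100 * mexican / total, 1) for mexican, total in legible_rows]

for (mexican, total), pct in zip(legible_rows, percentages):
    print(f"{mexican:>6,} of {total:>6,} = {pct} per cent")
```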


TABLE II

NUMBER AND PERCENTAGE OF MEXICANS EMPLOYED IN 15 INDUSTRIAL PLANTS
IN THE CHICAGO-GARY REGION, 1916-1928

Year           Mexican       Total        Per cent
               employees     employees    Mexican
[earlier years: figures illegible in the source]
[1925]           6,052        65,220         9.3
[1926]           7,269        65,888        11.0
[1927]           6,823        64,189        10.6
[1928]        [illegible]     65,682     [illegible]

Table II shows historically the employment of Mexicans in 15 industrial plants
in this region. It is believed that only 2 plants which employed 100 or more 
Mexicans in 1928 are not included here; in that year they employed about 700 
Mexicans. 

In Table II are included 5 meat packing plants, 7 metal and steel plants, a 
cement plant, a railroad-car repair plant, and a rug factory. The percentage of 
Mexicans employed in these plants in 1928 ranged from 3.7 per cent to 29.9 per 
cent. The figures for the table were secured partly from nationality reports, 
partly by identification of names on payrolls. Unavoidably, they do not all 
represent the same dates within each year nor identical bases of computation. It 
is believed that figures for 1929 would show a small increase in number of Mexicans
over 1928.
Paul S. Taylor


University of California 


IS THERE A BUSINESS CYCLE? 


A dinner meeting of the American Statistical Association was held on Febru- 
ary 13, 1930, at the Aldine Club, 200 Fifth Avenue, New York City. 

Dr. Edwin B. Wilson, President of the Social Science Research Council, pre- 
sided. 

The first speaker of the evening was Colonel Malcolm C. Rorty, President of 
the American Statistical Association. He expressed the opinion that business 
does not fluctuate at regular intervals of time, but that the business cycle must 
be regarded rather as an irregular series of up and down movements. The 
present tendency is for these movements to become somewhat less in amplitude 
than was formerly the case. Through statistical surveys, the larger business en- 
terprises are now able to obtain sufficient information concerning the economic 
situation to assist them materially in stabilizing their respective lines of business. 

Statisticians are usually prone to think of business as fluctuating about a line 
called “normal.” This concept is perhaps not as accurate as it should be. It
is better to think of business as departing from a line representing full-time em- 
ployment of those normally engaged in gainful occupations. The aim of our eco- 
nomic organization should be to keep industry as near this line as possible. It 
departs from the line primarily because industry is always changing in nature. 
These changes are usually due to such a multitude of causes that they approxi- 
mate pure chance. For this reason it is extremely difficult to forecast them. A 
change which is the result of major causes can, however, often be predicted 
with a reasonable degree of certainty. 

The second speaker of the evening was Dr. Warren M. Persons of The Gold- 
man Sachs Trading Corporation. He called attention to the fact that our 
knowledge of business cycles has mainly been gained from observation rather 
than from deduction based upon economic theory. The various theories of 
business fluctuations, he said, are of little use in the practical problem of busi- 
ness forecasting. 
Statisticians may be divided into two broad classes. The first class, consisting
of men like Oscar Anderson and Arne Fisher, are fundamentalists. The funda- 
mentalists believe in a trinity: mathematics, probability, correlation. They are 
more interested in nice mathematical processes than in an examination of 
premises. They believe in absolutes. Their point of view is mechanistic. 
The second class, consisting of such men as Bowley, Keynes, Allyn Young, and 
E. B. Wilson, are skeptics. The skeptics include among their number many of 
the greatest mathematical statisticians. They conceive mathematical statistics 
to be similar to mathematical physics in which, to obtain realistic results, prem- 
ises must be subjected to the closest scrutiny. They value mathematics as an 
instrument, but they are not hypnotized by mathematical symbols and they do 
not stand in awe of the tool they use. They do not believe in iron clad laws or 
mechanistic interpretations in economics. They suspect, for instance, the valid- 
ity of correlation coefficients derived from economic data by involved processes. 

In the study of business cycles we are hampered by the lack of definiteness of 
the terminology currently employed. We have inherited the terms “crises” 
and “depression.” It would be well if we substituted for these vague terms, de- 
scriptive of the status of business, other terms descriptive of the direction of 
movement of business. Thus, let us speak of “business expansion” and “business
recession” when referring to the cyclical movements of business.

The third speaker on the program was Professor Irving Fisher, of Yale Uni- 
versity. The subject of his address was “Cycles as Facts and Tendencies.” He 
pointed out that it is incorrect to speak of “the” business cycle, for, in truth, 
there are a whole series of business cycles going on at the same time. A sharp 
downward movement of business may be the result of two or three cyclical forces 
all acting in the same direction at the same time and their combined effect may 
be reinforced or mitigated by secular or chance forces. 

There are two kinds of truth: first, historical truth or fact, second, scientific 
truth or tendency. When we say that, if A is true, then B is true, we are dealing 
with scientific truth. The conditional clause is never put into a statement of
historical truth. 

Most historical facts are the result of forces too numerous to be described. 
The essence of science is simplification. The scientist eliminates either physical 
or mental extraneous forces and deals with a single force or a specific number. 
Very commonly, the force can never be isolated in practice. Thus we can find no 
actual example illustrating perfectly, even for a single second, Newton’s first law, 
that a body in motion tends to move in a straight line at uniform velocity. 

The cycle must be thought of as a tendency, not as a fact. Recently, statis- 
ticians have been inclined to give up the idea of periodic oscillations, and to talk 
of the cycle as merely a series of sequences—sometimes merely as an up and down 
movement. The fact is, however, that a mere up and down movement of busi- 
ness relatively to any average cannot logically be referred to as a cycle; for any 
statistical series—even a record of a gambler’s balance of gain or loss at Monte 
Carlo—will necessarily oscillate up and down relatively to its own average. In 
any true cycle, the pattern of one wave must in some significant sense repeat it- 
self over and over again. It is probable that tendencies to do this are actually 
existent, but that these cyclical tendencies are constantly being interfered with
by each other and by secular and accidental forces. 
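
The remark about the gambler's balance can be checked with a small simulation. The sketch below is an illustration only and was not part of the address; the fair one-unit wager, the seed, and the function names are all assumptions. It shows that a purely accidental series still passes back and forth across its own average, which is why mere up and down movement is no evidence of a cycle.

```python
import random


def gambler_balance(n_plays, seed=0):
    """Running balance after each of n_plays fair one-unit wagers (assumed game)."""
    rng = random.Random(seed)
    balance, path = 0, []
    for _ in range(n_plays):
        balance += rng.choice((-1, 1))  # win or lose one unit with equal chance
        path.append(balance)
    return path


def crossings_of_own_mean(series):
    """Count how often the series passes from one side of its own average to the other."""
    mean = sum(series) / len(series)
    # Points exactly on the average are ignored; only the side of the average matters.
    signs = [1 if x > mean else -1 for x in series if x != mean]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


path = gambler_balance(1000)
print("crossings of own average:", crossings_of_own_mean(path))
```

Any series that is not constant must lie above its average at some points and below it at others, so the count is at least one whatever the seed. A true cycle would require, in addition, that the pattern of one wave repeat itself.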

We know that, in the field of mechanics, cyclical motions do not keep going 
unless there is a persistent, guiding force; for example, it takes a spring or weight 
to keep the pendulum in motion. The same principle holds in the field of eco- 
nomics and business. 

Among the best understood cycles are the daily cycle and the yearly cycle. 
Strangely enough, these are not usually referred to as cycles, the term being 
more commonly reserved for the more or less irregular ups and downs of business 
and industry. 

Perhaps the most fruitful field of investigation for the statistician is to study
the causes of these ups and downs.

Professor Fisher pointed out that his investigations had revealed the following 
relationships: (1) That the sluggishness of adjustments in the rate of interest 
tends to cause business fluctuations; (2) that price changes when shifted forward, 
with a distributed lag, correspond closely to the fluctuations of business; and 
(3) also to fluctuations in the rate of interest. This has proved to be true both 
in the United States and in Great Britain. 
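
The idea of shifting price changes forward with a distributed lag can be sketched as follows. This is a minimal illustration, not Professor Fisher's actual computation: the weights, the sample figures, and the function name are assumptions, and the report does not state the form of lag distribution he used. Each value of the shifted series is a weighted sum of the price changes of several preceding periods.

```python
def distributed_lag(changes, weights):
    """Weighted sum of past changes.

    weights[0] applies to the change one period back, weights[1] to the
    change two periods back, and so on. Periods without a full history
    are returned as None.
    """
    k = len(weights)
    out = []
    for t in range(len(changes)):
        if t < k:
            out.append(None)  # not enough earlier observations yet
            continue
        out.append(sum(w * changes[t - 1 - j] for j, w in enumerate(weights)))
    return out


# Hypothetical monthly price changes (per cent) and assumed weights summing to 1.
price_changes = [1.0, 2.0, 0.0, -1.0, -2.0, 0.0, 1.0]
weights = [0.5, 0.3, 0.2]
lagged = distributed_lag(price_changes, weights)
print(lagged)
```

The shifted series would then be compared, period by period, with an index of business activity or with the rate of interest.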

Dr. Wesley C. Mitchell, of the National Bureau of Economic Research, dis- 
cussed the remarks of the preceding speakers. He stated that, though not dis- 
senting from anything Professor Fisher had said, he found it convenient to apply 
the term “business cycle” to historical waves of business activity, marked off
from each other by successive revivals or by successive recessions. 

Dr. Mitchell pointed out that recent studies by Dr. Simon Kuznets have re- 
vealed interesting “long cycles” and have shown the relationships of trends 
thereto. He suggested that current work upon business cycles is developing 
into intensive study of all the recognized types of business fluctuations. Each 
of these types presents its own claim to attention and its own need of a distinctive 
name. 

At present the National Bureau is making an intensive study of the cyclical 
behavior of all the economic activities of which it can find a satisfactory statistical 
record. For that purpose it has devised a special technique of time-series analy- 
sis which it is applying to American, British, French and German data. The re- 
sults promise much of interest, but it is not yet possible to say just what they 
may teach us. Such detailed studies of the similarities and varieties of cyclical 
movements seem to Dr. Mitchell a desirable preparation for further efforts at 
explaining business cycles. The time-honored sequence in economic inquiry 
is to explain phenomena first, and scrutinize them afterward. The National 
Bureau is reversing this sequence in its studies of business cycles. 

After a very brief discussion from the floor, the meeting adjourned. 
Willford I. King
RAILROADS AND AMERICAN PROSPERITY


A dinner meeting of the American Statistical Association was held at the 
Aldine Club, 200 Fifth Avenue, on Wednesday evening, March 19, at 6 o’clock. 
About one hundred and sixty-five persons were present. 

The presiding officer was Mr. Roy V. Wright, Managing Editor of the Rail- 
way Age. He performed his duties with great success. 

The first speaker of the evening was Mr. J. M. Fitzgerald, of the Eastern 
Presidents Conference. He spoke on the topic “Transportation—Past, Present,
Future.” He brought out, by numerous illustrations, the difficulties which the 
railways have had to overcome before arriving at their present position. He 
called attention, for example, to the fact that the first railroads paralleling the 
Erie Canal were allowed to haul freight only during that part of the winter when 
the Erie Canal was frozen. The different states were anxious to prevent freight 
from flowing freely from one state to another and, hence, as late as 1850, the 
railroad lines in New York, New Jersey, and Pennsylvania, all had different 
gauges. Interstate traffic on railways has been hampered by divergent legal 
regulations in different states. As late as 1917 we find many different state 
laws governing the type of headlight used on steam locomotives, and a headlight 
which might be legal in one state could not be legally used in another state. 

While railways may not have welcomed regulation at the start, certainly no 
one will now deny that a governmental supervision of railways is in the public 
interest—provided that regulation is kept free from political influence. 

Since the War, expansion in railway service has taken place mainly through 
improving the quality of railways rather than by increasing their mileage. Rail- 
way managements have been exerting themselves strenuously to cut costs, but, 
at the same time, to render better service than ever. In both lines, they have 
been surprisingly successful. That such is the case is well demonstrated by the 
fact that car shortages have practically disappeared and freight schedules have 
become far more dependable than formerly. 

The second speaker of the evening was Mr. Walter Case, of Case, Pomeroy, 
and Company. He first discussed the capital structure of the railway system, 
pointing out that, at present, the railways of the United States, when all are 
taken as a unit, have eleven billion dollars of debt, two billion dollars of pre- 
ferred stock, and six billion dollars of common stock. Mr. Case felt that, during 
the next five years, the railroads will probably require about eight hundred 
million dollars of new capital annually. Approximately half of this will, pre- 
sumably, be derived from the internal savings of the system. According to 
sound business principles, one-half of the remainder, or about two hundred 
million dollars per annum should be obtained through the sale of common stock 
and the other half, or two hundred million dollars, from bonds. This is a much 
larger proportion than has actually prevailed during the last eight years. As a 
matter of fact, in no year of the period just mentioned, has more than 17 per 
cent of the new capital been raised by the sale of common stock. The era of 
financing railroads in this way practically terminated in 1910. After that, the 
net income of the railways declined, and it was hard to float common stock 
issues. Recently, however, net earnings have been improved once more and it
therefore appears possible again to finance on a common stock basis. 
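
The arithmetic of Mr. Case's proposal can be set out briefly. The sketch below uses only the figures reported above; the function name is an assumption. Since the report does not say whether the 17 per cent applies to all new capital or only to capital raised in the market, both shares are computed.

```python
def capital_plan(total):
    """Split annual capital needs as reported above (figures in millions of dollars)."""
    internal = total / 2           # about half from internal savings
    external = total - internal    # remainder to be raised in the market
    common_stock = external / 2    # one-half of the remainder
    bonds = external - common_stock
    return {"internal": internal, "common_stock": common_stock, "bonds": bonds}


plan = capital_plan(800)
stock_share_of_total = plan["common_stock"] / 800
stock_share_of_external = plan["common_stock"] / (plan["common_stock"] + plan["bonds"])
print(plan)
print(stock_share_of_total, stock_share_of_external)
```

On either reading the proposed share of common stock, one-fourth of all new capital or one-half of the capital raised in the market, exceeds the 17 per cent maximum of recent years.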

Mr. Case next discussed the income of the railways when considered as a 
system. He pointed out that the chief revenues come from the hauling of pas- 
sengers and freight. Passenger revenue has been falling off for several years 
past. The growth of ton mileage has been slowing up, having increased less 
than two per cent per annum during the last decade as against five per cent per 
annum during the earlier periods. This decline in ton mileage of freight has 
not been due primarily to the competition of auto trucks, but has been caused 
largely by the fact that power is now being transmitted in new ways. Petroleum 
is being sent through pipe lines and electricity along wires. Both of these new 
methods of transportation have limited the necessity of hauling coal, and coal 
has been the most important item of railway freight. 

What is the outlook for railway net earnings in the future? It appears likely 
that coal shipments will continue to decline. The chances are, then, that gross 
earnings will increase but slowly. Can expenses be reduced? Mr. Case felt 
the chance of cutting the wage bill to be very slight. Taxes keep growing. 
There is some probability, however, that improvements in the locomotive will 
enable the railways to reduce the fuel bill by increasing steam pressure. The 
probabilities are, however, that the fuel bill cannot be cut more than one-third 
at the outside. 

If the railroads are to maintain their present earning power, it is essential that 
their rates not be reduced materially. The most dangerous feature in this re- 
spect is the possibility that the government may build or subsidize canals which 
would necessarily force the railroads to reduce rates, for freight hauled over gov- 
ernment canals would probably pay nothing whatever for interest on the in- 
vestment. 

The third speaker of the evening was Judge R. W. Barrett, Vice-President and 
General Counsel of the Lehigh Valley Railroad Company. He discussed three 
phases of the Transportation Act of 1920, viz: Standard Return, Recapture, and 
Consolidation sections. He pointed out that the Interstate Commerce Commission
had fixed 5¾ per cent as a fair return on the value of carrier property
used for transportation purposes and called attention to the fact that the Act
in no sense guarantees that any railroad will earn 5¾ per cent. He then referred
to the Recapture section and stated that, while the government does not
guarantee that any railroad will earn 5¾ per cent, or any other per cent, on its
property, no portion of what it does earn will be taken away from it under the 
Recapture clause until the amount is equal to 6 per cent on its property used for 
transportation purposes. After that percentage has been reached, under the 
Recapture clause, the government will take one-half of the excess over 6 per cent. 
The funds thus taken by the government are to be used for purchasing equipment 
to be leased to and for making loans to weak roads. The reason the Recapture 
clause provides that only one-half of the percentage over 6 per cent shall be re- 
served for the government, as stated by Senator Cummins, is that if all the sur- 
plus were taken, railway officials would see to it that there is no surplus to take. 
The provisions of the Recapture section are difficult to enforce since they depend 
upon making a valuation of the properties of the carriers. The Interstate Commerce
Commission is, according to Commissioner Lewis, now preparing to start
recapture proceedings against one hundred and sixty-one railroads, and the 
amount involved is somewhat in excess of $150,000,000. The government has 
collected to date $8,607,128.51. Until recent years the policy of the govern- 
ment has been to prevent the consolidation of railroads that would in any man- 
ner decrease competition. The Consolidation sections of the Transportation 
Act of 1920 reversed this policy. By the Consolidation section Congress en- 
deavored to provide a general consolidation scheme which would result in twenty 
or twenty-five systems, with all systems able to earn approximately the same re- 
turn on capital investment in transportation service, and with competition pre- 
served in so far as possible. Very little has been accomplished under the Con- 
solidation section. This form of consolidation is artificial. Railroads have 
been consolidating ever since their construction began, but heretofore consolida- 
tions have been based upon competitive and financial reasons rather than upon 
governmental action. The advocates of consolidation contend that it will result 
in great economies in operation. Great Britain consolidated her railroads a 
number of years ago on the theory that she would save $125,000,000 a year, and 
continue competition. However, no economies of any importance have been 
effected and a great deal of competition has disappeared. 
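
The working of the Recapture clause, as Judge Barrett described it, may be illustrated with a short computation. The sketch is an assumption-laden illustration: the function name and the sample railroad are hypothetical, and the valuation of the property, which is the real difficulty in practice, is simply taken as given.

```python
def recapture_amount(property_value, net_income, threshold_pct=6, government_share=0.5):
    """Amount taken by the government: one-half of earnings above 6 per cent.

    property_value is the value of property used for transportation purposes;
    net_income is the year's earnings on that property. Both are in dollars.
    """
    allowed = property_value * threshold_pct / 100
    excess = max(0.0, net_income - allowed)  # nothing is taken below the threshold
    return government_share * excess


# Hypothetical road valued at $100,000,000.
print(recapture_amount(100_000_000, 8_000_000))  # earns 8 per cent
print(recapture_amount(100_000_000, 5_000_000))  # earns 5 per cent
```

A road earning 8 per cent on $100,000,000 has an excess of $2,000,000 over 6 per cent and surrenders half of it; a road earning 5 per cent surrenders nothing and receives nothing, since the Act guarantees no return.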

The last regular speaker on the program was Dr. David Friday, of A. G. 
Becker and Company. He began by showing that the value of railroads which 
the Interstate Commerce Commission has attempted to determine has little 
connection with the economist’s idea of value. The whole object of this valua- 
tion is to limit the profits of railways. 

Practically, railway presidents tend to vie with each other in showing what a 
large proportion of net earnings they can utilize in improving the condition of 
the road. When they do this, the Interstate Commerce Commission is likely 
to say that, since railroads are able to finance themselves by internal saving, it is 
unnecessary to allow higher rates to enable them to secure new capital in the 
market. 

Whenever a commission regulates a railway rate, such regulation constitutes 
limitation of the right of private property. The recent tendency on the part of 
the Supreme Court is to be more liberal with the railways and to allow them to 
participate in the general prosperity; in other words, to consider the railroads as 
being more in the nature of competitive concerns than was formerly believed to 
be the case. The truth is that railroads are not nearly so monopolistic as is 
commonly supposed. Clear evidence of this fact is given by the reduction of 
passenger traffic brought about by the increasing use of automobiles. Now 
trucks are proceeding to be substituted for freight trains in carrying freight. 
Furthermore, in many sections there is decided competition between different 
railroad lines. In determining what is a fair rate to allow railroads on a com- 
petitive basis, the Supreme Court should take into consideration the earnings of 
competitive concerns. In most lines, they do not have records of profits going 
back for a considerable period of years. The National Banks furnish one excep- 
tion. A study of reports for the past shows that, in the 70’s, the earnings of all 
National Banks were 8.4 per cent. During the next decade, the banks of the
system earned 8.2 per cent. In the 90’s, they fell to 6.5 per cent, but increased 
in the next decade to 9.5 per cent. On this basis, it appears that the allowance 
of 5¾ per cent for railway profits is entirely too low.

The discussion of the addresses of the evening was led by Mr. Glenn G. Munn, 
of Paine, Webber, and Company. He expressed the view that the railroads have 
been so completely cowed by legislation and regulation that they do not now even 
dare to ask for a fair return. Railroad rates are still being reduced little by little.
On the other hand, the American Telephone and Telegraph Company maintains 
its rates on a relatively high level by bringing suit whenever its rate structure is 
threatened. If the railroads had more spirit, they might well be able to do the 
same. 

There was considerable discussion from the floor, one inquirer desiring to 
know whether or not the electrification of railways would not result in great 
economy in the next few years. In reply, Judge Barrett stated that electrifica- 
tion is an extremely expensive improvement, and, as a rule, is not profitable ex- 
cept on the lines where the traffic is dense. 

In reply to the following question by Mr. Munn: “If the curve of freight traffic 
is flattening out and passenger revenues continue to decline as in the past, what 
is the necessity for appropriating $800,000,000 per year for capital expenditures 
which is an amount somewhat above the average annual proportion in the last 
decade?”, Mr. Case pointed out that all sorts of improvements necessary to cut
operating costs require heavy capital investments. He also emphasized the fact 
that the railroads are compelled to put much money into unproductive improve- 
ments such as the elimination of grade crossings. However, even though an 
enormous amount of money is being continually expended for this purpose, no 
progress is being made toward the elimination of the grade crossings, for each 
year more new ones are established than are eliminated by the improvements. 

The meeting adjourned. Willford I. King


MISCELLANEOUS NOTES 


The San Francisco Chapter Activities.—The San Francisco Chapter met in Decem- 
ber to elect officers for 1930 and to discuss “Correlation Analysis in Business and 
Economic Research.” Forty-nine persons were seated at the dinner table. 

The first speaker, Mrs. Ruth McChesney Howe, of the Division of Agricultural 
Economics, University of California, discussed “Advantages of Correlation Methods.”
She emphasized the usefulness of correlation method in analyzing data for forecasting 
purposes and held that it is peculiarly adapted to price studies covering specific 
commodities. If basic conditions do not change seriously, it is safe to assume that 
relationships observed in the past will hold for at least a short period of time in the 
future and thus form the basis for a trustworthy forecast. Correlation methods 
may also be adapted for use in price studies and in studies of other kinds of data in 
which an attempt is made to assign to each of the independent variables its separate 
effect upon the dependent variable that is being estimated. In using correlation 
analysis for this purpose direct causal relations are not in any way proved, but by a 
process of reasoning, causation may be inferred from the associations observed. 
Complete correlation analysis usually requires three types of constants in order
adequately to describe related facts. These are constants which measure: (1) the
nature of the relationship, i.e. regression coefficients; (2) the degree of the
relationship, i.e. coefficients of correlation and determination; (3) the discrepancy
of the relationship, i.e. the standard error of estimate. Problems involving
non-linear relationships require a modification of the data. Such modification may
be accomplished by employing either the logarithms, the reciprocals or various
powers of the data involved.
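
The three types of constants can be computed for a simple two-variable case as follows. This is a minimal sketch, not Mrs. Howe's own procedure: the data are invented, the function name is an assumption, and the standard error of estimate is taken here as the root-mean-square residual (divisor n), which is one of several conventions.

```python
import math


def simple_correlation(x, y):
    """Regression coefficient, correlation, determination, and standard error of estimate."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    regression = sxy / sxx                    # (1) nature of the relationship
    correlation = sxy / math.sqrt(sxx * syy)  # (2) degree of the relationship
    determination = correlation ** 2
    # (3) discrepancy: scatter of y about the regression line (divisor n assumed)
    std_error = math.sqrt(max(0.0, syy * (1 - determination)) / n)
    return {
        "regression": regression,
        "correlation": correlation,
        "determination": determination,
        "std_error": std_error,
    }


# Hypothetical observations.
result = simple_correlation([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(result)
```

For a non-linear relationship the same function could be applied after replacing the data by their logarithms, reciprocals, or powers, as the paragraph above suggests.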

Mrs. Howe illustrated her talk with charts showing some of the principles involved 
both in simple and multiple correlation, and presented an application of the graphic 
method of curvilinear correlation analysis. 

The second speaker of the evening was Professor Theodore O. Yntema, of the 
Graduate School of Business Administration, Stanford University. Professor 
Yntema discussed “Limitations of Correlation Analysis.” He mentioned three kinds
of limitations to correlation analysis: (1) limitations of interpretation; (2) limita- 
tions of sampling; and (3) peculiar difficulties in correlating time series. 

Although the existence of correlation not due to errors of sampling indicates some 
sort of direct or indirect causal relation between the variables, it does not afford evi- 
dence as to the direction, or directions, of causation. The coefficient of correlation 
is a useful measure of relation but it should be remembered that the degree of accuracy 
of estimation attainable is always less than the degree of correlation. 
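
One common way of reading this last remark, offered here as an assumption rather than as Professor Yntema's own formulation, is that the standard error of estimate equals the standard deviation of the dependent variable multiplied by the square root of one minus the square of the coefficient. The proportionate reduction in error is then smaller than the coefficient itself for any coefficient between zero and one.

```python
import math


def error_reduction(r):
    """Proportionate reduction in the standard error of estimate for a correlation r."""
    return 1 - math.sqrt(1 - r * r)


for r in (0.5, 0.6, 0.8, 0.9):
    print(r, round(error_reduction(r), 3))
```

A coefficient of 0.8, for example, reduces the error of estimate by only four-tenths, and a coefficient of 0.6 by only two-tenths.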

In considering limitations of sampling the possibility of biassed as well as random
errors should be noted. The contributions of Pearson, “Student,” and R. A. Fisher 
have done much to give precision to the random sampling limitations of small samples. 
Attention was called to the danger of applying the usual methods to gauging the re- 
liability of an hypothesis suggested by the data. 

It was indicated that the peculiar difficulties in the correlation of time series may 
be attributed chiefly to the organic connection between successive observations, the 
difficulties in isolating the fluctuations to be correlated, and the liability of the factors 
causing the fluctuations to change during the period studied and in the future. 

At the conclusion of Professor Yntema’s remarks a considerable number of those 
present participated in a vigorous discussion of the ideas presented in the two main 
papers. 

The election of officers for 1930 installed Mr. M. K. Bennett of the Food Research 
Institute, Stanford, as Secretary-Treasurer; Mr. H. D. Gidney, Pacific Telephone 
and Telegraph Company, San Francisco, as Vice-President, and Mr. Oliver P. Wheeler, 
Federal Reserve Bank, as President. 


At a dinner meeting on February 27, 1930, Dr. Frank M. Surface, Assistant Director 
of the United States Bureau of Foreign and Domestic Commerce, addressed the San
Francisco Chapter on the subject, “The Cost of Distribution.” Dr. Surface explained
and illustrated the work of his Bureau in collecting and analyzing the costs
of handling individual items or of performing individual services in wholesale and 
retail establishments. The evidence presented regarding (a) the extraordinarily large 
number of articles commonly handled at a loss, (b) the wide extent of territory un- 
profitably canvassed by salesmen of some wholesale houses, and (c) the large propor- 
tion of retail establishments withdrawing from business in the course of a year gave 
rise to interested discussion. 


At a dinner meeting on March 26, 1930, Dr. Norman J. Silberling of the University 
of California, Director of the Silberling Research Corporation, Ltd., addressed the 
San Francisco Chapter on “Methods of Business Forecasting.” Dr. Silberling’s
address consisted of a descriptive and analytical comparison of the methods used by 
the Harvard Economic Society, the Standard Statistics Corporation, and the Silber- 
ling Research Corporation in forecasting cyclical movements of business activity in 
the United States. Emphasis was placed upon the changes in method which the 
several organizations have adopted in the course of their activity. In general, these 
changes tend to show that at present mechanical forecasting devices—statistical leads 
and lags—are regarded as less satisfactory than they were some years ago, and that 
supplementary evidence both qualitative and quantitative has been given increasing 
weight in the formulation of forecasts. Dr. Silberling also pointed out that the 
approach to the problem has differed, especially between the Harvard Economic 
Society and the Standard Statistics Corporation. The former has tended primarily 
to forecast the swings in general business activity, and secondarily the movements in 
component industrial groups, whereas the latter has tended primarily to anticipate 
developments in various groups, and secondarily the general movement. 

Washington Statistical Society.—The Washington Statistical Society held its first
meeting of the year on April 3 in the Oak Room of the Raleigh Hotel. The meeting, 
attended by sixty members, had as its topic “Producers’ Response to Price in Various 
Fields” and was presided over by W. M. Steuart, Director of the Census. Papers 
given by L. H. Bean, Department of Agriculture, F. F. Elliott, Bureau of the Census, 
and F. G. Tryon of the Bureau of Mines were discussed by Dr. O. C. Stine and Dr. 
E. Dana Durand. Summaries of these papers follow: 

Dr. Bean spoke on the “Producers’ Response to Prices of Crops.” He said that in 
the case of crop producers it is necessary to distinguish between two general types of 
response, (1) their response to price during the course of a marketing season when the 
crop has been produced and is available for market, and (2) their response to a price 
situation in the following period of production. The first case deals with the amounts 
of a given supply offered for sale when related to the different prices for which those 
amounts are offered, and is closely related to the economists’ supply curve for any 
period of time within a marketing season. It is this upward sloping supply curve 
which the economist usually shows as intersecting the downward sloping demand 
curve to indicate the point at which price tends to be established. The second type 
deals with the relation of a price situation in one season to the farmer’s production 
plans, or acreage changes, in the following season or seasons. 

The first type of crop producers’ response to price may be illustrated with the sales 
of potatoes in the past few years. There is a considerable margin in this crop between 
production and sales, that margin representing quantities retained for food, on the 
farm, for seed, for feed and other purposes. The amounts reserved for these purposes 
and the amounts sold depend to a large extent on the price. 

The second type of producers’ response to price in terms of the following season’s 
acreage may be illustrated by an example taken from a study entitled “The Farmer’s
Response to Price.” It is there shown that although there is a large variety of 
elements which control the farmer’s acreage plans, such as the availability of credit, 
and labor supply, weather conditions, crop yields, returns per acre, etc., changes in 
acreage can be related to prices obtained for the two preceding crops and an appar- 
ently high degree of response observed in such crops as cotton, potatoes and sweet 
potatoes, rye, flax, cabbage, strawberries and watermelons. In some cases the prices 
of competing crops were taken into account, for example cotton prices in relation to 
sweet potato acreage, and wheat prices in relation to rye and flax acreage. 


1 See Journal of Farm Economics, July, 1929.
The several price-acreage curves appear to bear a general family resemblance, in
that the degree of response following years of high prices is not so great as in years 
following low prices. In some cases the influence of both high and low prices appeared 
to last for two years; in others low prices only appeared to influence acreage changes 
two years later. The degree of response was found to vary with each commodity and 
for one commodity it was found to vary by regions, although following a general type 
of price-acreage response. 

The validity of these price-acreage curves which are based on post-war data prior to 
1929 is indicated by the fact that they not only give very good estimates of acreage 
changes for the period covered in the analyses, but they also indicated on the whole 
satisfactorily the changes in acreage in 1929. 

“Producers’ Response to Price in the Production of Hogs” was the subject treated
by Dr. F. F. Elliott. He maintained that a close relationship between changes in 
price and changes in hog production has long been recognized. Only in recent years, 
however, have there been any serious attempts made to measure this relationship. 

Since the relationships involved are quantitative, methods appropriate for the 
purpose had to be developed before accurate measures could be obtained. The recent 
development in correlation method, particularly in multiple linear and curvilinear 
correlation, has provided a basis of approach to this problem and has given an impetus 
to a number of specific studies. 

Before it is possible to measure the relationship between changes in price and 
changes in production of hogs, it is necessary to eliminate the effect which factors other 
than price have upon production. This involves a complete analysis of all the factors 
affecting production. In a study of the factors affecting hog production in the Chi- 
cago market area, it was found that approximately 96 per cent of the fluctuations from 
year to year in receipts, taken as a measure of the supply, could be accounted for by 
five independent factors. The most important of these factors was the corn-hog 
ratio or relationship between the price of corn and the price of hogs. To measure the 
full influence of this factor, it seemed to be necessary to take the relationship between 
corn and hog prices at different periods. Thus the corn-hog ratio was taken at time 
of breeding, for a six months’ period or more preceding breeding, and for a three months’ 
period following breeding. Approximately 75 per cent of the fluctuations in receipts 
at Chicago from September to April of each year could be accounted for by the rela- 
tionship of corn and hog prices at these three periods. 

In addition to these factors, climatic conditions at farrowing time and trend were
the other factors found to be important.

The amount of change in receipts, associated with given changes in each of these 
factors was shown by net charts, which technically represented the net regression of 
receipts upon each of the factors mentioned after allowing for the effect of the other 
factors considered. 

These results are in line with all studies which have been made of producers’ re- 
sponse to price in the production of agricultural commodities. Farmers characteristi- 
cally seem to be influenced by current or past conditions in making their production 
plans. All hog farmers, however, do not respond in their production to given changes 
in price and other factors in the same way. That is to say, hog production is much
more elastic in some states or areas than it is in others. 

Because of variations in physical and climatic conditions, there is frequently a
rather wide variation in the size of the corn crop in different parts of the corn belt 
states. This variation in corn production frequently results in significant differences 
in the relationship between corn and hog prices in different regions. As a consequence,
farmers in certain regions find it advantageous to increase hogs, due to a favorable 
corn-hog ratio, while farmers in other areas at the same time find it disadvantageous,
due to an unfavorable situation. Unless these differences are taken into account, the 
significant responses in production in local areas are not likely to be shown. 

In order to measure these differences in response which farmers make to given 
changes in price in local areas, an analysis of the factors affecting production of hogs 
was made by individual states. Charts showing the relationship between changes 
in the corn-hog ratio (taken at time of breeding and during the period in which the 
previous year’s supply was marketed), and between marketings of hogs in each of the 
corn belt states, were presented. The charts clearly show that the response in mar- 
ketings to given changes in each of these regions was different in different states. 
Hog production, for example, apparently is less elastic in the Eastern corn belt states 
of Ohio, Indiana and Illinois; somewhat more elastic in Iowa, Nebraska and South 
Dakota; and most elastic in Missouri and Kansas. When the analysis is restricted 
still further to type of farming areas in a particular state, the same differences in 
response are to be noted. These variations in response in local areas need to be taken 
into consideration when attempting to forecast the probable total future supply or in 
making adjustments in local areas. 

In his paper on “Producers’ Response to Price in the Mineral Industries,” Dr.
F. G. Tryon stated that in mining there are other factors that greatly affect the pro- 
ducer’s response to price, and unless they can be given quantitative expression, it is 
hazardous to forecast the output which a given price will call forth. In soft coal 
mining, for example, there have been ten months in the last decade when the output 
stood at 50,000,000 tons, but the open market price for these ten months has varied 
from $1.83 a ton to $9.51. 

Among the disturbing factors are the following: (1) An external cause, like a car 
shortage, may intervene to prevent the operator from producing as much as he would 
like to. (2) Labor disturbances affect supply profoundly, and their occurrence, ex- 
tent, and duration are scarcely predictable. (3) Wage changes may shift the general 
level of prices to a higher or lower plateau, and while they are influenced by changes 
in general commodity prices and in wages in other industries, they depend primarily 
on the relative bargaining power of the miners and operators and are scarcely pre- 
dictable. (4) Discoveries of new deposits that can be worked at lower costs fre- 
quently upset prices, especially in the metal and petroleum industries. (5) Depletion 
of known deposits tends to prevent their return to former production levels at the 
same price. (6) Rapid improvement in transport may open up rich deposits hitherto 
inaccessible and sharply affect price relations. (7) Technologic change, such as the 
Frasch sulphur process, may suddenly reduce costs and price. 

Price-product relations and especially price-demand relations in the minerals have 
been neglected but would richly repay study. Because of the complicating factors, 
however, statistical methods must be supplemented by methods of the economist and 
engineer. 


The Detroit Meeting.—On March 27 there was a meeting of the Association mem- 
bers in the Detroit area. Dr. C. H. Seehoffer, of the University of Detroit, presided. 
Dr. Margaret Elliott, of the University of Michigan, read a paper on “Women in 
Business.” Another paper was presented by Dr. V. P. Timoshenko, of the 
University of Michigan, on “Agricultural Fluctuations as a Factor in the Business Cycle 
in America.” 

Dr. Timoshenko pointed out that there exists great difference of opinion concerning 
the rôle of agricultural fluctuations in the generation of business cycles in this 
country and that, therefore, further quantitative study is necessary. Such a 
quantitative study must demonstrate, first of all, the existence of cycles in agricultural 
production itself. 

The index of the physical volume of leading crops, when corrected for trend and 
smoothed by a two-year moving average, reveals quite clearly cycles in crop produc- 
tion in this country, for the period 1866-1926. The amplitude of the cycles is con- 
siderable (the deviations being as great as 10 per cent on either side of the trend), 
but decreases a little after 1900. There is not strict periodicity, the length of cycles 
fluctuating from three or four years up to eight years. 
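
The procedure described above (correct the index for trend, then smooth by a two-year moving average) may be sketched as follows. This is only an illustration under stated assumptions: the report does not say how the trend was fitted, so a least-squares straight line is assumed here, and the figures in `crop_index` are invented for the example and are not Dr. Timoshenko's data.

```python
def linear_trend(values):
    """Least-squares straight line fitted to values against 0..n-1 (assumed trend form)."""
    n = len(values)
    mean_t = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in range(n))
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(values))
    slope = sxy / sxx
    return [mean_y + slope * (t - mean_t) for t in range(n)]


def percent_deviation(values):
    """Deviation of each item from the fitted trend, in per cent of the trend."""
    trend = linear_trend(values)
    return [100.0 * (y - tr) / tr for y, tr in zip(values, trend)]


def moving_average(values, window=2):
    """Simple moving average; window=2 averages each pair of consecutive years."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


# Hypothetical annual index of crop volume (invented figures).
crop_index = [100, 108, 97, 103, 112, 104, 109, 118]
smoothed = moving_average(percent_deviation(crop_index))
```

Cycles would then be read from the swings of `smoothed` above and below zero; a deviation of 10 corresponds to the "10 per cent on either side of the trend" mentioned in the text.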

The index of total agricultural production (crops and animal production), when 
smoothed, also reveals cycles of about the same character. The index of crop prices 
shows a close negative correlation with the volume of crops (the coefficient of corre- 
lation is —.73 for the period 1866-1914) and reveals cycles which may be completely 
explained by the cycles in the volume of crops. The ratio of agricultural prices to 
industrial prices also reveals certain cycles which correlate negatively with the volume 
of crops. 

This ratio of agricultural prices to industrial prices is an important factor in the 
explanation of the business cycle. A low ratio of agricultural prices to industrial 
prices is favorable to business revival because it creates a larger margin between 
the cost of raw materials and the selling prices of industrial goods. A comparison 
of the ratio with business annals shows that the low ratios always precede business 
revival or, at least, coincide with it. The high ratios occur often during periods of 
high prosperity or financial stringency and immediately precede recession. 

Again, cycles in the physical volume of agricultural production create cycles in 
agricultural exports. The correlation between them was fairly close for the period 
1870-1914. Furthermore, wide fluctuations in agricultural exports dominated the 
fluctuations in the balance of payments and caused an inflow or outflow of gold. 
The coefficient of correlation between these series was about +.70. Large crops were 
accompanied by a considerable increase in the inflow of gold. As a result, the rate 
of increase of money in circulation before the War correlated closely with the 
fluctuations in agricultural exports. The coefficient of correlation between these 
series for the period 1870-1914 was +.65. 

Thus, large crops create a relationship of agricultural prices to industrial prices 
that is favorable for industry, and at the same time bring additional purchasing power 
for industrial products from abroad. It is true that this additional purchasing power 
is not generally accumulated by the farmers, as they often get less for a large crop 
than for a small one, due to the great flexibility of agricultural prices. But it is the 
railroads, distributive organizations, and exporters, for whom these large crops not 
only increase the physical turnover of business, but also the margin between the prices 
which they pay and those at which they sell. 

In conclusion it may be said that agricultural fluctuations are of considerable 
importance in the explanation of business cycles in this country, especially for the 
pre-war period. However, a fairly close relationship between agricultural fluctuations 
and the business cycle has also existed since the War. Even the recession of 1929 may be 
partially explained by unfavorable agricultural factors. The crops of 1929 were on the 
average about 10 per cent below those in 1928, farm prices during the fall of 1929 
were more than 10 per cent higher in comparison with the previous year, and agricul- 
tural exports fell considerably during 1929-1930. 


The Pittsburgh Meeting.—On January 15, the Second Annual Dinner of the 
American Statistical Association, Pittsburgh District, was held in the University 
Club when approximately 200 business and professional men in the city turned out to 
enjoy an excellent program. Mr. James C. Chaplin, President of the Colonial Trust 
Company, Pittsburgh, presided as Toastmaster. 

Mr. M. C. Rorty, President of the American Statistical Association and Vice- 
President of the International Telephone and Telegraph Company, one of the 
principal speakers, gave a thoroughly sound address on the subject “A Business 
Man Looks at Statisticians,” in which he stressed the importance of the statistician 
in modern business. He briefly reviewed the vast changes which have taken place in 
our conduct of business from the time when one individual had complete control of 
the business and based his policies on his opinions or ofttimes on “hunch,” to the 
present complex structures with thousands interested in the success of a particular 
industry and the control vested in the hands of many. With conditions as they exist 
today it is vital that someone, and preferably a statistician, apply analysis to condi- 
tions not only in that industry but also to factors which affect both that industry and 
business in general. The successful conduct of modern business depends on manage- 
ment having before it the facts and figures relating to business so arranged and 
analyzed as to present a sound basis for future planning and operation. 

Mr. S. L. Andrew, Chief Statistician of the American Telephone and Telegraph 
Company, discussed, under the title of “A Statistician Looks at Business,” the 
general misapprehension which is commonly supposed to apply to the term statistician 
and the function of the statistician in unravelling the current business situation. 
Mr. Andrew clearly demonstrated that business, when subjected to the searching 
analysis of the statistician, might reveal an entirely different picture from that which 
has popularly been assumed to be the case. As the first of the year is the customary 
time for business prognosticators and statisticians dealing with business conditions 
to present their views as to the ensuing twelve months the speaker was constrained 
to offer a summary of the present situation. Generally speaking, the natural re- 
sources, the inherent progressiveness of the American people and the present sub- 
stantial state of business tend to minimize any temporary uncertainties in the cur- 
rent business picture. The recent stock market decline has focused attention upon 
the downward tendency of business and perhaps accentuated it somewhat, but Mr. 
Andrew believes that, when the readjustment which is now taking place is over, 
the depression will be found to have been but a slight interruption in the general 
forward tendency for national expansion and growth of business which has been in 
effect since 1922. 


This branch, working in conjunction with the Research and Statistics Committee 
of the Chamber of Commerce, has been active in developing public interest and 
coöperation for the 1930 Census. Playlets have been produced in schools, placards 
placed in street cars and on billboards, radio talks have been arranged, and news- 
paper publicity secured. 


Recent Activities of the Chicago Chapter.—On January 22, 1930, Mr. Walter 
Lichtenstein, Executive Secretary, First National Bank of Chicago, addressed a 
dinner meeting of the Chicago Chapter on the subject “Problems in World Banking.” 
Mr. Lichtenstein was General Secretary of the Organization Committee of the Bank 
for International Settlements. His talk dealt with personal observations and im- 
pressions gathered while serving on the Organization Committee and was of a confi- 
dential nature. The attendance was forty. 

At a dinner meeting of the Chicago Chapter on March 19, 1930, Dr. Garfield V. 
Cox, Associate Professor of Business Economics at the University of Chicago, dis- 
cussed informally various problems which arose in appraising business forecasts made 
by representative American forecasting agencies. Fifty-five members of the 
Chapter were present. 

Dr. Cox’s study of business forecasts has recently been published by the School 
of Commerce and Administration of the University of Chicago under the title An 
Appraisal of American Business Forecasts. In his talk Dr. Cox mentioned several 
conclusions which were not covered in the material published. He stated that, al- 
though his study showed that the forecasting services are right considerably oftener 
than they are wrong, they have not increased their accuracy during the last ten 
years. The services, however, are now making more definite forecasts than they made 
in the past. The evaluation of the probable effects of individual factors in the 
business situation has resulted in better forecasts than the “pattern” method of 
forecasting. Reliance on the effect of the factor of money conditions has caused more 
errors in forecasting than reliance on any other individual factor. 

Dr. Cox stressed the point that in making his study he was interested in appraising 
the success of business forecasting in general rather than in appraising the records of 
individual forecasting agencies. 


Meetings of the Columbus Chapter.—The Columbus Chapter of the Association 
has held two interesting meetings since the beginning of the year. The first of these 
was in January, the topic being, “A Statistical Program as a Basis for a Quantitative 
Ethics.” Professor Harvey Walker of the Department of Political Science, Ohio 
State University, expressed it as his belief that great improvement in the adminis- 
tration of public affairs would result from linking the educational and research pro- 
gram of the university with the legislative and administrative program of the state. 
Professor Dewey of the College of Commerce discussed some of the difficulties in- 
volved in the development of a statistical program that could be of value to adminis- 
trators. Miss Mary L. Mark of the Department of Sociology contended that the 
careless methods of collecting much of the available social statistics rendered them 
almost worthless so far as the drawing of inferences is concerned. 

In the discussion, an issue was raised between philosophers and statisticians as to 
whether ethical problems could ever be subjected to quantitative treatment. The 
philosophers contended that an ethical standard was an inherent attribute of a rela- 
tionship and the statisticians held to the position that an ethical standard derives its 
significance from its relation to statistical norms. 

At the February meeting, the general topic was “Farm Relief: A Legislative or 
Statistical Problem?” Professor Falconer of the College of Agriculture opened the 
discussion by pointing out that the farmer has not succeeded nor will succeed in solv- 
ing his problem by improvement in production alone, for improvement in productive 
methods may lead to a plethora in the market with baneful results. The public, 
he said, needs to recognize that the farmer is as much entitled to legislative support 
of his program as is the business man or labor leader. 

Dr. C. J. West, Director of Research for the Ohio Farm Bureau, raised the question 
as to whether there was a farm problem in any sense different from the city problem, 
viz., the problem of finding profitable employment of his time and resources. Farm- 
ing, he pointed out, is not a single occupation. The producer of vegetable oils is in 
competition with the producer of animal fats, and so on. The problem is one of 
comprehending the economic forces at work and basing organization on this 
understanding. 


The Cleveland Chapter.—The Business Statistics Section of the local American 
Statistical Association chapter has continued to hold its regular monthly meetings. 
These have been unusually well attended, the attendance being between 20 and 25 
for the last two or three meetings. Subjects discussed have included the business 
outlook for the rest of 1930, particularly in the fourth district; stock market trends; 
and the work of the 1930 census, with particular reference to statistical methods. 


United States Bureau of Labor Statistics.—A field survey of employment, wages, 
prices, living conditions, etc., in the Hawaiian Islands has just been completed. 
The Bureau is authorized by law to make a study of this character every five years, 
but due to lack of funds this is the first which has been made since 1915. 

The work of the Bureau on volume of employment is growing and now covers the 
following industrial groups: manufacturing, mining, quarrying, public utilities, 
trade, hotels, and canning and preserving. The next to be added will be crude- 
petroleum production. Rayon and radios have recently been added to the manufac- 
turing group, and airplanes, paints and varnishes, jewelry, and rubber goods (other 
than tires and boots and shoes) are now being included. 

The labor turnover figures of the Bureau now include rates for a number of separate 
industries—automobiles, boots and shoes, cotton manufacturing, iron and steel, 
slaughtering and meat packing, sawmills, and foundries and machine shops—in addi- 
tion to figures for the manufacturing group as a whole. More industries will be 
added to this list from time to time. Over 2,500 companies are now reporting to the 
Bureau on labor turnover, more than seven times the number reporting when this 
work was taken over from the Metropolitan Life Insurance Company. 

A general statistical study of all types of coöperative associations, except farmers’ 
marketing organizations, is being made and will cover the period 1926-1929. 

A compilation of the labor legislation of 1929 is now in progress and a bulletin 
giving selected court decisions of 1927 and 1928 is in press. 

During the first quarter of the year field agents of the Bureau began collecting 
data on wages and hours in the cigarette, cotton, woolen, and rayon manufacturing 
industries. The gathering of statistics of industrial accidents in the United States 
for 1929 was also begun. 

Data on wages and hours in the cement, airplane, furniture, boot and shoe, and 
slaughtering and meat-packing industries are being compiled for bulletins on the re- 
spective industries. Summary figures on wages and hours in foundries and machine 
shops and in the furniture industry were published in the Monthly Labor Review for 
February and April, 1930, respectively. 


The Forthcoming Meeting of the International Statistical Institute.—A special 
meeting of the International Statistical Institute is to be held at Tokio, Japan, on 
the invitation of the Japanese Government, during the week of September 15-20, 
1930. Professors Walter F. Willcox, Warren M. Persons and Warren S. Thompson 
are expected to attend and read papers. The English members expected are Professor 
Arthur L. Bowley and Sir William Beveridge. 


International Conference of Agricultural Economics and Farm Management.—An 
International Conference of Agricultural Economists was held at Dartington Hall, 
Totnes, Devon, England, last year. Another conference will be held this year at 
Cornell University, Ithaca, New York, August 18 to 29. 

This conference is primarily of interest to those working in the field of agricultural 
economics, farm management, agricultural marketing, farm credit, farm taxation, 
agricultural prices, agricultural history, coöperation, and other allied subjects. 
Many persons from various countries throughout the world will be present and take 
part in the discussions. 
The programs for the conference will be available about June 1, 1930, and may be 
obtained by writing to Professor Leland Spencer, Department of Agricultural 
Economics and Farm Management, Cornell University, Ithaca, New York. 


Special Research Fellowships.—The National Tuberculosis Association announces 
a limited number of fellowships in social research as related to tuberculosis, open to 
graduate students who have had special training in statistics, social science or public 
health. Preference will be given to candidates who are interested in pursuing 
research in public health after the completion of this fellowship. 

Researches on topics selected by the National Tuberculosis Association will be 
conducted in collaboration with colleges and universities, and each study will be 
under qualified academic leadership. 

Academic credit may be allowed for this research according to arrangements with 
the individual universities under whose supervision they are undertaken. 

Each Fellow will be required to submit a written report at the completion of his 
fellowship grant and the text of that report shall remain the property of the National 
Tuberculosis Association. Candidates will be considered not alone on academic 
standing, but on experience and general fitness for research work. 

The fellowship grants will date from the beginning of the academic year in the fall 
of 1930. They are for a twelve-month period and the fellowship grant amounts to 
$1,500 for that period with a month’s leave for vacation. 

Interested candidates should write to Jessamine S. Whitney, Statistician, National 
Tuberculosis Association, 370 Seventh Avenue, New York City, for further information. 


Statistical Instruction at the University of Paris.—Beginning November 4, 1929, 
instruction at the Institute of Statistics of the University of Paris, under the general 
direction of M. Alfred Barriol (Secretary of the Statistical Society of Paris) was offered 
as follows: Elements of Statistical Method, Professors Lucien March and de Bernon- 
ville; Mathematics Applied to Statistics, Professor Georges Darmois; Demography and 
Sanitary Statistics, Professor Michel Huber; Theory of Life Insurance, Professor 
Maurice Hochart (Actuary of La National Vie); Theory of Finance, Professor Alfred 
Barriol; Elements of Mathematical Economics, Professor Jacques Rueff. 

The course is intended for employees of banks and insurance companies. Follow- 
ing the examinations, the Institute of Statistics confers a statistical diploma in the 
name of the University of Paris and this is registered with the Ministry of Public 
Instruction. 


United States Personnel Classification Board.—There is now in press a publica- 
tion prepared by the United States Personnel Classification Board showing a detailed 
classification of some 100,000 positions in the Field Service of the Government as a 
part of its report on a survey authorized by the Act of May 28, 1928. Each class is 
described in a formal statement, known as a class specification, including general 
statements of duties and responsibilities, examples of work performed, and minimum 
qualification requirements. The 100,000 positions are covered in 1,633 classes. 

This report, according to the provision of law, is to be used as a basis for the dis- 
tribution, by the various heads of departments, of the positions under their respective 
jurisdictions to the classes described, for statistical and cost purposes. 

The Board also has under way the preparation of its final report to Congress which 
will be completed after the cost data are assembled. 


United States Government Periodical Mimeographed Statements.—Valuable 
statistical data on business and industry are now being published by the United 
States Government bureaus in the form of mimeographed sheets and press releases 
in order to give information to the public as speedily as possible. Some statisticians 
and research workers who know what to ask for have had their names placed on mailing 
lists to receive certain releases regularly each week or month, but no inclusive check 
list has ever been made of all these reports that are available. 

The Special Libraries Association, 11 Nisbet Street, Providence, Rhode Island, 
undertook to compile such a check list entitled Descriptive List for use in Acquiring 
and Discarding United States Government Periodical Mimeographed Statements which 
is designed to aid all who are building up files of industrial statistics such as com- 
modity production, consumption, stocks, orders and shipments. 


PERSONAL NOTES 


Professor Robert E. Chaddock is spending his sabbatical leave from Columbia 
University in Europe, June, 1930, to January, 1931, observing the organization and 
handling of population data and vital statistics in a number of large cities, including 
London, Paris, Brussels, Vienna and Berlin. He expects to spend some time in 
Geneva becoming familiar with the work of the League in these fields. 


Mr. Allan Sproul, formerly Assistant Federal Reserve Agent of the Federal Reserve 
Bank of San Francisco, was appointed Assistant Deputy Governor and Secretary of 
the Federal Reserve Bank of New York, effective March 1. 


Oliver P. Wheeler, President of the San Francisco Chapter of the Association, has 
recently been elected an officer of the Federal Reserve Bank of San Francisco. Mr. 
Wheeler is now Assistant Federal Reserve Agent, having been promoted from the 
managership of the Division of Analysis and Research. Mr. N. Merritt Sherman 
succeeds Mr. Wheeler as department head. 


Dr. Emma A. Winslow, recently Research Secretary of the Child Health Demon- 
stration Committee of the Commonwealth Fund, is now associated with Miss Mary 
van Kleeck on studies of the social and economic factors in crime for President 
Hoover’s Commission on Law Enforcement. 


The Commonwealth Fund, under the direction of its Research Statistician, Mary A. 
Clark, has completed the development of a standard system for recording and report- 
ing the operations of child guidance clinics. The system is described in a Manual 
which is now in the process of publication. 


Mr. Howard R. Tolley has resigned his position as Assistant Chief of the Bureau of 
Agricultural Economics of the United States Department of Agriculture, to become 
Assistant Director of the Giannini Foundation, and Professor of Agricultural Eco- 
nomics at the California College of Agriculture. 


Dr. M. J. B. Ezekiel, formerly Senior Agricultural Economist of the Bureau of 
Agricultural Economics, has recently assumed his new duties as Assistant Chief 
Economist of the Federal Farm Board. 
Committee on Price Statistics 
Committee on Statistical Monographs 
Edwin B. Wilson 
Malcolm C. Rorty 
Harry C. Carver 

Committee on Governmental Labor Statistics 
Meredith B. Givens 


MEMBERS ADDED SINCE MARCH, 1930 


Aldrich, Laurence W., Electric Bond and Share Company, 2 Rector Street, New 
York City 

Allendorf, Ruth, Business Research Bureau, Metropolitan Life Insurance Company, 
New York City 

Andrews, M. E., Domestic Commerce Division, Bureau of Foreign and Domestic 
Commerce, Washington, D. C. 

Atwood, Albert W., Saturday Evening Post, Philadelphia, Pa. 

Ayres, Horace C., University of Washington, Seattle, Wash. 

Baker, George A., Milbank Memorial Fund, 49 Wall Street, New York City 

Barr, E. C., Room 811, John Jay Hall, Columbia University, New York City 

Baxter, Elizabeth, Librarian, Haskins and Sells, 15 Broad Street, New York City 

Beede, Kenneth C., Shaw, Loomis and Sayles, 24 Federal Street, Boston, Mass. 

Bergman, W. G., Department of Research, Detroit Public Schools, 1354 Broadway, 
Detroit, Mich. 

Biedrzycki, Anton, Standard Accident Insurance Company, Detroit, Mich. 

Bliss, Charles A., 530 Riverside Drive, New York City 

Boyce, Charles W., American Paper and Pulp Association, 18 East 41 Street, New 
York City 

Brittan, Albert K., Harris Upham and Company, 11 Wall Street, New York City 

Broeker, Milton M., 431 East Franklin Avenue, Naperville, Ill. 

Bruckner, Harold, The Equitable Corporation of New York, 11 Broad Street, New 
York City 

Buckley, Ruth H., Statistical Department, Tri-Utilities Corporation, 40 Exchange 
Place, New York City 

Burdell, Edwin S., Ohio State University, Commerce Building, Columbus, Ohio 

Burwell, Laurence K., Automatic Signal Company, 205 Church Street, New Haven, 
Conn. 

Calder, Philip R., United Fruit Company, 1 Federal Street, Boston, Mass. 

Cameron, James C., Queen’s University, Kingston, Canada 

Campbell, Robert L., 223 Thompson Avenue, East Liverpool, Mich. 

Carr, Francis J., Statistical Section, Hahn Department Stores, Inc., 11 West 42 
Street, New York City 

Case, Harold C. M., College of Agriculture, Urbana, Ill. 

Cavanaugh, Eleanor S., Standard Statistics Company, Inc., 200 Varick Street, 
New York City 

Chiu, Alfred K., Harvard College Library, Cambridge, Mass. 

Clark, Florence M., Bureau of Business and Social Research, University of Buffalo, 
Buffalo, N. Y. 
Crowe, Stanley E., Michigan State College, East Lansing, Mich. 
REVIEWS 


Employment Fluctuations in Pennsylvania, 1921 to 1927, by J. Frederick Dewhurst. 
Special Bulletin No. 24, Department of Labor and Industry. Harrisburg: 
Bureau of Publications, Commonwealth of Pennsylvania. 1928. ix, 192 pp. 


Our conception of the magnitude of the problem of unemployment has come 
largely from our studies of employment fluctuations. We have, in this country, 
no authentic record of the volume of unemployment; we do not know what is 
normal unemployment and we do not know how to evaluate the estimates which 
we get from time to time. The current decennial census will give us a bench- 
mark, for which we ought to be tremendously grateful because it will serve to 
demonstrate the necessity of other and continuous observations. We do have, 
however, some splendid current and historical records of fluctuations in employ- 
ment, covering significant industries and industrial regions. The present volume 
is one of these. Mr. Dewhurst explains in his preface that the data on which his 
report is based were collected from employers of Pennsylvania on a voluntary 
basis through the collaboration of the Federal Reserve Bank of Philadelphia and 
the Department of Labor and Industry of the Commonwealth of Pennsylvania. 
Mr. Dewhurst has done a fine piece of work in his judicious collection of data, his 
construction of index numbers, and his analysis of the results; incidentally, his 
work is an example of what can be accomplished in states in which current infor- 
mation on employment is not collected on an obligatory basis. 

The report consists, essentially, of three divisions: problems of collection and 
organization of employment reports; methods of summarizing the data; and 
analysis of employment fluctuations in Pennsylvania as shown by the data. 
Through a variety of sources, reports were secured from about 1,200 employers of 
labor in the State, in manufacturing, retail and wholesale trade, construction, 
steam and electric rail transportation, anthracite mining, and telephone commu- 
nication. The reports cover numbers employed, payroll, and employee-hours. 
The reporting firms represented large proportions of the wage earners in the 
industries covered in manufacturing (40 per cent), in anthracite mining (54 per 
cent), and in transportation and communication (30 per cent). The results in 
the other cases could not be stated accurately. In general, the samples were 
consciously selected in order to secure a large proportion of wage earners, wide- 
spread geographic distribution, and adequate representation to both large and 
small firms. Some fortuitous selection is present (inevitably) because of the 
bearing of available lists of names and records. The final sample is rather heavily 
weighted with the larger concerns (in manufacturing, 64 per cent of the wage 
earners covered are in plants employing over 500 workers, and the average of 
reporting firms is 361 workers as compared with an average of 81 workers for the 
firms covered by the Census of Manufactures in the State), but Mr. Dewhurst 
shows that there is no wide divergence between the employment fluctuations of 
large and small plants. I think that other workers in the field will agree that this 
does not vitiate the sample. 
The method employed in the calculation of index numbers of employment in 
the several industries is a modification of the fixed-base-chain-index method 
with a changing list of firms. The changing list of firms allows for growth and 
flexibility in handling small monthly fluctuations in the number of reports re- 
ceived in time for inclusion. The modification in the “fixed base’’ consists in 
computing estimates of numbers employed in the base period by those concerns 
whose records for the base year are not available, on the basis of the ratio of 
change shown by identical establishments for which the data are available. Mr. 
Dewhurst has guarded against error by checking his material against census 
records. He has adjusted his index numbers between 1923 and 1925 in order to 
have them correspond with the trend in employment as shown by the Census of 
Manufactures. The index numbers for the several industries were placed on a 
1923-1925 base and combined by weighting in accord with the relative impor- 
tance of the several industries as employers of labor. Seasonal adjustments were 
made wherever required. 
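
The chain-index idea described in this paragraph can be illustrated with a short sketch. It represents one plausible reading of the method rather than Mr. Dewhurst's actual computation: each month is linked to the preceding month through those establishments reporting in both ("identical establishments"), and a firm lacking a base-period record has its base employment estimated from the ratio of change shown by the index. The function names and all figures are hypothetical.

```python
def chain_index(reports, base=100.0):
    """Chain index from monthly reports.

    reports: list of dicts, one per month, mapping firm name -> number employed.
    The list of firms may change from month to month; each link uses only
    the firms present in both of two consecutive months.
    """
    index = [base]
    for prev, curr in zip(reports, reports[1:]):
        matched = prev.keys() & curr.keys()
        if not matched:
            raise ValueError("no identical establishments in consecutive months")
        link = sum(curr[f] for f in matched) / sum(prev[f] for f in matched)
        index.append(index[-1] * link)
    return index


def estimate_base_employment(later_employment, index_at_later, index_at_base=100.0):
    """Estimate a firm's base-period employment from a later report,
    assuming it changed in the same ratio as the identical establishments."""
    return later_employment * index_at_base / index_at_later


# Invented example: firm C begins reporting in the second month,
# and firm A drops out in the third.
reports = [
    {"A": 100, "B": 200},
    {"A": 110, "B": 220, "C": 50},
    {"B": 231, "C": 55},
]
index = chain_index(reports)
base_for_c = estimate_base_employment(50, index[1])
```

The weighting of the several industries and the adjustment to Census of Manufactures trends mentioned in the text are not attempted in this sketch.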

In the analysis of his results, the author discusses the seasonal and cyclical 
influences at work in this period and makes several interesting comparisons of his 
employment indices with other measures of economic activity. It is rather sur- 
prising to find seasonal charts for 30 of the 52 manufacturing industries considered. 
The author’s discussion of seasonality appears uncritical and the reviewer doubts 
that seasonality can be demonstrated in a considerable number of the cases pre- 
sented. Any of the conventional methods will yield seasonal coefficients when 
applied to practically any economic series. It is not enough to present seasonal 
index numbers and curves; the existence of normality must be demonstrated since 
a seasonal index is nothing more than an average. 
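
The reviewer's point, that an average can always be computed whether or not a regular seasonal pattern exists, may be illustrated by a small sketch. The method shown (ratios of each period to its own year's average, then the mean and spread of those ratios across years) is an assumption chosen for simplicity and is not claimed to be the method used in the bulletin; the two quarterly series are invented.

```python
from statistics import mean, pstdev


def yearly_ratios(series, period=12):
    """Each item as a percentage of the average of its own year."""
    years = [series[i:i + period]
             for i in range(0, len(series) - period + 1, period)]
    return [[100.0 * v / mean(year) for v in year] for year in years]


def seasonal_summary(series, period=12):
    """For each position in the year, return (seasonal index, spread across years).

    The index is the mean of the yearly ratios; the spread is their standard
    deviation. A large spread relative to the index's departure from 100
    suggests that no stable seasonal pattern has been demonstrated.
    """
    ratios = yearly_ratios(series, period)
    return [(mean(col), pstdev(col)) for col in zip(*ratios)]


# Invented quarterly data: the same pattern every year...
stable = [90, 110, 100, 100] * 3
# ...and a pattern that changes from year to year.
erratic = [80, 120, 100, 100, 120, 80, 100, 100, 85, 115, 100, 100]

stable_summary = seasonal_summary(stable, period=4)
erratic_summary = seasonal_summary(erratic, period=4)
```

Both series yield a first-quarter index below 100, but only in the stable series is the spread of the yearly ratios small enough for the index to describe a typical year.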

In order to establish the reliability of the index numbers of employment, they 
are compared with other economic series such as the ratios of positions to applicants
for work, help wanted advertising, applicants for relief at charitable agencies,
carloadings, etc., and the results indicate a high degree of reliability. Mr.
Dewhurst finds in his analysis that employment fluctuations in plants making
producers’ goods are more severe than the corresponding fluctuations in plants 
turning out consumers’ goods, that there is no great difference between the month 
to month fluctuations in large and small plants, and that there is a high degree 
of correlation between employment fluctuations and retail trade, the former 
usually preceding the latter by two or three months. 
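
A lead of one series over another, such as that reported here for employment over retail trade, is commonly examined by correlating the first series with the second shifted by successive lags. The sketch below assumes simple Pearson correlation; the function names and the monthly figures are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def lagged_correlations(leader, follower, max_lag):
    """Correlate leader[t] with follower[t + lag] for lag = 0..max_lag."""
    result = {}
    for lag in range(max_lag + 1):
        x = leader[:len(leader) - lag]
        y = follower[lag:]
        result[lag] = pearson(x, y)
    return result


def best_lag(leader, follower, max_lag):
    """Lag (in months) at which the correlation is highest."""
    corrs = lagged_correlations(leader, follower, max_lag)
    return max(corrs, key=corrs.get)


# Hypothetical monthly indexes: trade repeats employment two months later.
employment = [100, 102, 105, 103, 99, 96, 98, 101, 104, 106, 103, 100]
trade = [97, 98] + employment[:-2]
lead = best_lag(employment, trade, max_lag=4)
```

With series as short as these the correlations at neighboring lags rest on few observations, so a sketch of this kind indicates the direction of a lead rather than establishing its length.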

The monthly index numbers by industries and localities are listed in an appen- 
dix and the current indices are carried in the reports of the Federal Reserve Bank 
of Philadelphia. The author’s summary of a couple of pages or so contains an 
extraordinarily good and pithy statement of the problem of unemployment. 
Finally, it is a pleasure to note that the text, unlike so many statistical studies, is 
characterized by a very readable, easy-flowing style. 

RALPH J. WATKINS
National Bureau of Economic Research

Quelques Aspects de l’évolution des Prix au Siècle Dernier et en Notre Temps, by
Maurice Virlogeux. Paris: Giard. 1927. 235 pp.

Writing with a clear and trenchant style, Dr. Virlogeux masses facts and 
arguments in a striking challenge to those theorists who see in the monetary 
factor “the prime antecedent, the profound cause, the chef d’orchestre” of the
numerous phenomena of price variation. In the historical record the author 
distinguishes three great classes of periods. These are: 

1. Periods in which the monetary factor does in fact play an exceptional and 
active part. These are periods of sharp and considerable inflation, during which 
no other factor exercises an influence of comparable magnitude. Characteristic 
examples are afforded by the John Law episode, and by the experience with 
assignats. 

2. Periods during which fluctuations in exchange rates control the movements 
of prices, the exchange factor outranking in importance internal monetary 
changes and movements of other economic factors. The author suggests that the 
experience of France during 1926 furnishes an instance of this type. 

3. Periods marked by greater complexity, when there is no evidence of the 
domination of any one factor. In France such periods have been the most 
numerous, the most interesting, and the most difficult to deal with. These 
periods are of several types. First, we have periods during which the quantity of 
precious metals varies materially, but exchange rates remain constant or prac- 
tically constant. The decade of 1850-60 furnishes an example of such 
a period. Secondly, we have periods during which exchange rates remain fixed 
while the monetary factor varies because of fluctuations in the supply of paper 
money. The period 1914-1918 exemplifies these conditions. In the third 
place the author distinguishes periods during which the quantity of money 
remains constant or varies but slightly, while exchange rates vary materially. 
France, during the years 1919-1924, passed through such a period. 

The author’s conclusions rest upon the results of intensive studies of economic 
phenomena in France during the above periods. With reference to the main 
point at issue the author finds that the quantity theory of money, in the clear-cut 
form which is identified with the name of Irving Fisher, is inadequate as an 
account of the actual processes of price variation, except under the double con- 
dition of excessive emission and suppression of ordinary international economic 
ties. Under other conditions the “evolution of price” is explained in terms of 
numerous non-monetary factors, of which the most important are conditions of 
production and fluctuations in exchange, the latter, in its turn, reflecting diverse 
political, financial and commercial influences. The rôle of money, whether in
metallic or credit form, is, in general, says Virlogeux, an accessory one. The
monetary factor responds to the strong impulses communicated by changes in the 
conditions of production and of other economic elements, but except under special 
conditions it cannot play a commanding part. The sharp, quasi-mathemati- 
cal systematization of phenomena afforded by theories in which this factor plays 
a leading rôle is, therefore, says Virlogeux, unrealistic and misleading.
FREDERICK C. MILLS

The Economic Development of Post-War France, by William F. Ogburn and William
Jaffé. New York: Columbia University Press. 1929. xii, 613 pp.


This volume is the third in a series of six investigations edited by Professor 
Carlton J. H. Hayes, and entitled Social and Economic Studies of Post-War 
France. It is concerned primarily with production, and is divided into two main 
sections. The first gives a general survey of the vicissitudes of French economic 
life since 1914, with especial reference to the reorientation brought about first by 
the War and later by currency depreciation. The second section is devoted to a
more detailed examination of the five leading industries and of agriculture, with 
additional chapters on agriculture and the combination movement. 

Taken as a whole, the evidence submitted clearly indicates that France suf- 
fered less from the post-war upheavals than any other major European belliger- 
ent, and underwent a less drastic shift in the currents of her economic life. She 
lost over two million men in the War itself (a loss part of which, however, was 
later made good by immigration); her northern provinces were devastated; and 
the post-war inflation obviously forced an extensive redistribution of her wealth 
and income. But there the main categories of loss stop. Unlike England, 
post-war France was not torn by protracted politico-economic struggles of suffi- 
cient severity to threaten her economic existence itself; unlike Germany, 
she did not go through an inflation so extreme as to paralyze all eco- 
nomic activity. Though agriculture lags even now, her foreign trade regained 
the pre-war volume as early as 1923, and her industrial production by 1924. 
Nor was she compelled to turn to foreign lenders for any large amounts of addi- 
tional capital, to replace the sums dissipated by war and inflation. France today 
is strong, prosperous, and as nearly self-sufficient as any modern nation can easily 
be. The explanation of the rapidity of her recovery seems to lie in several 
different directions. In the first place, the losses resulting directly from the War 
were confined pretty much to the invaded provinces; no general disorganization of 
the French economy as a whole developed. In the second place, the return of 
Alsace and Lorraine brought with it great wealth in iron ore, potash, iron and steel 
plants, and textile equipment; and although the lack of adequate markets made 
these new possessions a doubtful blessing at first, a marked increase in the aggre- 
gate national output and prosperity eventually resulted. Finally, the depre- 
ciation of the currency, severe though it was, seems to have had singularly little 
depressing effect upon the physical volume of production or on the general pattern
of French economic life. It was chiefly the distribution of wealth and
income which was altered, rather than the absolute volume; and in some directions
a persistent stimulus to production was apparently felt. One is almost
forced to the conclusion that whereas Germany had too much inflation and 
England perhaps too little, France hit it about right! 

The study is not easy to criticize on matters of substance except by one intimate
with the details of the French situation, but it seems to rest on an exhaustive use 
of the available materials. The materials themselves, however, are often re- 
grettably limited, and force the authors back on qualitative generalizations just 
when quantitative analysis would have been most illuminating. The failure to 
bring most of the data beyond 1926 is also irritating, but was perhaps unavoidable.
The organization of the book itself is also open to some objection,
since it does not proceed on any very consistent and progressive plan. Finally, 
various lacunae in both substance and general reasoning are explained by the 
fact that this volume is only one of six, three of them still unpublished, which 
cannot fairly be judged independently of one another. 

JAMES W. ANGELL
Columbia University

The Farmer’s Standard of Living, by E. L. Kirkpatrick. New York: The Century
Company. 1929. xiii, 299 pp.

This book should be a good tonic to city people who have formed an exagger- 
ated picture of farm conditions in terms of labor income and per cent of return on 
investment. It is a sane, well-balanced book, packed with much useful infor- 
mation. 

“The farm family which moves to the city primarily in search of a larger in- 
come is likely to meet with disillusionment. The satisfactions of farm life are not 
all in the pay envelope.” The book makes many interesting comparisons be- 
tween farmers and city wage-earners. Some of these will be noted below. 

The author begins with a definition of the term standard of living. The term 
may be used, he suggests, in two ways: (1) the actual standard of living, which 
means the “array of goods used in meeting the needs and wants of the family”; 
(2) the desired standard of living, the “array of goods desired or regarded essential 
to meet the needs and wants” of the family. This is certainly clearer than the 
classification made by certain writers who have termed the former the plane of 
living or the scale of living; the latter, the standard of living. 

But here a difficulty arises. The author straightway proceeds to measure the 
“actual standard” in terms of the money value of the goods used during a year. 
Two people may expend exactly the same amount of money during a year. One 
spends his surplus, available over and above necessities, on material
luxuries; the other spends his surplus on cultural satisfaction, books, concerts, 
theatres, pictures, travel, etc. Now these two people obviously do not have 
“standards of living” that are at all equal. Yet measured in terms of the money
spent on goods and services consumed the actual standard is the same. Dr. 
Kirkpatrick does not fail to recognize this distinction and frequently a qualitative 
view of the standard of living of American farmers from a cultural standpoint is 
noted in the book, but the money measurement predominates. It is also worthy 
of attention that there are differences in the efficiencies of consumers as well as of 
producers. The money value of an income is not a true measure of the 
actual standard even considered from a quantitative standpoint. 

On page 76 is an interesting table comparing the incomes of 2,886 farm families 
in 11 states for 1922-24 with those of 12,096 workingmen’s families in 92 industrial 
centers for the year 1918. The modal and average incomes are substantially the 
same in both cases and the proportion of the incomes spent on food, clothing, 
rent, and “all other” is also substantially similar. Dr. Kirkpatrick does not
think that the data are sufficiently adequate to show that farm families have 
higher actual standards of living than workingmen. In both groups there are
many families living at the poverty level. The author finds (though he is
careful to state that the comparison is suggestive rather than conclusive) that 
the farmer’s diet is richer in meat, eggs, cheese, milk, cream, fruit and vegetables 
than that of the workingman. The workingman’s diet runs relatively more to 
cereals. As to clothing he finds the expenditures of the two groups to be about 
the same. About 5 per cent of 2,886 farm houses studied in 11 states were completely
modern, about 20 per cent partly modern, i.e., had either central heating,
electric lights or running water. About 75 per cent had none of these improve- 
ments. Farmers spend only about half as much on furnishing and equipment as 
city wage-earners. The consumption of coal is about the same for both groups, 
though city wage-earners spend much more on gas and electricity. Expenditures 
for “health” appear about the same for farmers and city wage-earners. The 
comparison of death rates in country and city (p. 144) is of little value since crude 
death rates are used. A larger percentage of rural school children are defective 
physically compared with city children. In this regard the children from the 
rural villages make a poorer showing than farm children. Expenditures for 
health and life insurance are on the average about the same for farmers and city 
wage-earners. The same is true of recreation expenditures. Most readers, I 
think, will be surprised at many of these comparisons. They are, however, quite 
similar to the findings of Professor C. C. Zimmerman of the University of Minnesota,¹
who takes as his basis of comparison the amounts spent over and above the
expenditures for food, clothing and household purposes, i.e. for “non-physiological”
purposes. His conclusion is that farmers are better off as to both incomes
and standards of living than the lower two-thirds of the urban population. 
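
The objection raised above to crude death rates can be made concrete with a short sketch of direct age standardization. The two populations, the age classes, and the standard population are entirely hypothetical and are chosen only so that the age-specific rates are identical while the age compositions differ; none of the figures come from the book.

```python
def crude_rate(population, deaths):
    """Deaths per 1,000 persons, ignoring age composition."""
    return 1000.0 * sum(deaths.values()) / sum(population.values())


def standardized_rate(population, deaths, standard):
    """Deaths per 1,000 if the area had the age make-up of `standard`."""
    expected = sum(deaths[age] / population[age] * standard[age]
                   for age in standard)
    return 1000.0 * expected / sum(standard.values())


# Hypothetical areas with the same death rate within each age class.
country_pop = {"young": 600, "old": 400}
country_deaths = {"young": 3, "old": 16}
city_pop = {"young": 800, "old": 200}
city_deaths = {"young": 4, "old": 8}
standard = {"young": 700, "old": 300}

country_crude = crude_rate(country_pop, country_deaths)
city_crude = crude_rate(city_pop, city_deaths)
country_std = standardized_rate(country_pop, country_deaths, standard)
city_std = standardized_rate(city_pop, city_deaths, standard)
```

Here the crude rates differ solely because the hypothetical country population is older; once both areas are referred to the same standard population the difference disappears.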

The correlation between education and income is brought out in a table on 
page 219. When both farm operator and homemaker had an 8th grade education
or less, the income averaged $1,484; with 8th to 12th grade education, it
averaged $1,755; and with more than a 12th grade education, $2,032.
In Nebraska, of 1,145 families, studies show
that 3.6 per cent of the grown sons and daughters were college graduates, and 20 
to 25 per cent had finished high school. 

The author is not alarmed over the mortgage debt situation. He finds that 
farm property is not overburdened with debts in comparison with other indus- 
tries. Nor does he find any relationship between mortgage indebtedness and 
standard of living. This may be true, but a general relationship does not always 
tell the true story. The agricultural crisis, the farm bankruptcies and rural bank 
failures are certainly not unrelated to the load of indebtedness contracted at the 
high war prices. The discrepancy between the present price level and the price 
level at which debts from 1918 to 1920 were contracted is quite possibly 
of greater significance than the maladjustment between the current prices of 
farm products and of manufactured goods. Indeed, farmers who contracted 
no debts at the high price level are in the main not suffering from agricultural 
depression.

ALVIN H. HANSEN
University of Minnesota

¹ Bulletin 255: Incomes and Expenditures of Minnesota Farm and City Families, 1927-28, St.
Paul, 1929.

Hand-to-Mouth Buying: A Study in the Organization, Planning and Stabilization of
Trade, by L. S. Lyon. Washington: The Brookings Institution. 1929. xv,
487 pp.

The publication of Mr. Lyon’s latest book occurred at a singularly opportune 
time. From 1920 to 1928 hand-to-mouth buying was under almost constant 
discussion, but during the latter year interest waned, and in 1929 financial and 
monetary problems became the absorbing topics of the day. Hand-to-mouth 
buying was accepted by the majority of business men as an established practice 
to which methods of production, marketing, and delivery had already been
adapted. After the collapse in security prices, however, analysis of the current
business situation disclosed the existence of low inventories in the majority of
industries, and this fact was fastened upon by myriad forecasters as the prime 
reason why the business recession would be short-lived. Since low inventories 
are mainly the result of hand-to-mouth buying, this subject has regained a 
part of its former vigor, and Mr. Lyon’s study should prove instructive. 

The study is divided into four major parts: (1) the setting; (2) the evidence; 
(3) effects and concomitants; and (4) the permanency of the new conditions. 
Causes of hand-to-mouth buying are discussed in some thirty-two pages in Parts 
One and Four. This small amount of space prevents the comprehensive con- 
sideration of price movements, style trends, transport changes, and inventory 
control, and Mr. Lyon’s failure adequately to consider these underlying causes 
detracts from the value of his work. Part Four also includes an historical study 
of hand-to-mouth buying, and it is clearly shown that this practice is a recurrent 
cyclical one and not a new development since 1920. 

In Part Two the evidence is presented. This includes figures on the amount of 
advance orders relative to the volume of business done and on the size of indi- 
vidual orders in various industries. Although industries vary greatly, the 
figures on advance orders show in general a downward trend. With the size of 
the orders the case is not so clear, but as Mr. Lyon states, “There is reason
to believe that small orders have been a more serious phenomenon and a more 
continuous one since 1920.” Part Three is concerned with size of stocks of 
goods in the various stages of production and with the stability of the flow of 
goods, orders and shipments. Again it is safe to agree that stocks in general are 
lower than in the past and that business is somewhat more stable. These are, 
however, generalizations which cannot be made in regard to individual industries 
and companies. 

In both Parts Two and Three Mr. Lyon presents a mass of statistical material, 
a part of which is valuable, a part interesting, and a part of little value. The 
data on the canning industry are exceptionally fine, and on shoe manufactur- 
ing there are some excellent figures. On other industries, notably the textile 
and the automobile, data are meagre, and in many cases figures are so frag- 
mentary that their value is dubious. Wholesaling and retailing are not well 
represented. In several instances figures go back only to 1923 or 1924, and in the 
majority of series there are no figures for the period prior to 1920. Conclusions 
to be sound should be based upon a study of conditions prior to the period of 
extreme price inflation. Relatively few inventory figures are given: the bulk of
the figures on stocks of goods are those for entire industries, taken from the Survey
of Current Business. Whereas these figures have value as circumstantial evi- 
dence, the study would have been greatly strengthened by the inclusion of actual 
inventory figures for a wide range of industries. Moreover, it does not seem 
necessary to include statistics for the seasonal products of agriculture where the 
total stock remains the same, regardless of whether hand-to-mouth buying is 
practiced or not. 

But, although there are certain weaknesses in the data, the collection of past 
figures on business, especially for individual companies, is a difficult task, and 
Mr. Lyon has amassed a considerable amount of material. A greater weakness 
lies in his failure to seek beyond his data and to probe the effects of hand-to- 
mouth buying upon the marketing and production methods in the various fields. 
The interrelations between production and hand-to-mouth buying are notably 
neglected. The space used for the re-working and charting of published 
data could have been devoted to a description of the changes in business practice 
which have occurred. For example, Mr. Lyon’s discussion of shoe manufacturing 
could have been amplified by a consideration of in-stock departments, production 
periods, market analysis, style simplification, and “blanket” orders. It could
have been elaborated by a description of the repercussions of changed buying 
methods upon the tanners of leather and the suppliers of shoe findings. Thus, 
even though Mr. Lyon’s general conclusions are sound, the work will 
disappoint the reader who seeks a thorough study of the causes and effects of 
hand-to-mouth buying in the so-called “new era.” 

JOHN W. HARRIMAN
Harvard University

Scientific Method, by Truman Lee Kelley. Columbus: The Ohio State University
Press. 1929. 195 pp.

The subject matter of this book is a course of five lectures delivered at the Ohio 
State University under the auspices of the Graduate School and the Department 
of Psychology. The second lecture, on the “Rôle of Judgment,” is based upon a
questionnaire submitted by the author to his colleagues at Stanford University. 
Class reports furnished a large part of the material for the final lecture on “Men
of Science.”

The topics discussed have both a general and a special interest. Lectures I, 
II, and V deal with essential principles common to much scientific investigation:
the adaptation of methods to the specific field and type of inquiry, various
considerations affecting the use of questionnaires, and the mental traits of men
who have been recognized as successful scientists. Lectures III and IV are more
specialized. In one the author discusses the units to be used in the measurement 
of intelligence and achievement; in the other he inquires as to the bearing of recent 
scientific developments upon problems of education and inheritance. 

The opening lecture stresses the point that in research there is no single scien- 
tific method. The appropriate procedures depend upon the lines of approach to 
the subject. The inquiry may have historical aspects and for these the historical
method is to be used. Relationships and changes in the present may be investi-
gated by direct observation through the use of the experimental method. Or, we 
may observe and record our observations in quantitative form when it is difficult 
or impossible to control the varied and complex factors under investigation, either 
in the present or over a period of time. Thus, the statistical method to a large 
extent is a substitute in the social sciences for the experimental method so suc- 
cessfully used in the physical sciences. Finally, the subject of inquiry may be 
related to the future and can be studied in this aspect by the laws of probability 
and by the technique of prediction. 

From the final lecture the reader will receive both vision and specific suggestion. 
It is devoted to “a study of what research has meant in the world and some acquaintance
with its process and with the characters of the men who have made it
mean what it does.” The objectives of scientific method and research have been
clearly understood since Aristotle. His followers listen to him saying, “Let us
first understand the facts and then we may seek their causes.” Again, “We must
not accept a general principle from logic only, but must prove its application to 
each fact, for it is in facts that we must seek general principles, and these must 
always accord with the facts.” But Aristotle made few observations himself, 
and had a very limited range of facts from which to generalize. His was a 
philosophical age. 

We cumulate and diversify methods of attacking problems as we cumulate 
culture. The successful physical scientists have combined with deductive 
reasoning the technique of observation by experimentation. The quest 
has been ever for facts and more facts. From childhood, Charles Darwin, 
destined to become one of the world’s greatest scientists, had a passion for 
collecting things and a genius for observation. Then he made a discovery
which changed his outlook: he found in the middle of England what he thought
was a tropical shell. He tells us how surprised he was “at Sedgwick not being 
delighted at so wonderful a fact.” As for himself, he says, “Nothing before had
ever made me thoroughly realize . . . that science consists in grouping facts so 
that general laws and conclusions may be drawn from them.” Recognizing, 
henceforth, the need for a purpose, a working hypothesis, in scientific work, 
Darwin saw clearly the perils of limiting the field of inquiry. He says, “I have
steadily endeavored to keep my mind free so as to give up any hypothesis, however 
much beloved (and I cannot resist forming one on any subject), as soon as the 
facts are shown to be opposed to it. . . . I cannot remember a single first-formed 
hypothesis which had not after a time to be given up or greatly modified.”

As exemplified in the attitude and work of Darwin and others, scientific method 
seems to have the following characteristics: 


(1) Delimitation of the subject of investigation, based upon previous achievement.

(2) Wide acquaintance with facts, collected, classified and related, in the
chosen field.

(3) Working hypothesis, formulated in such a manner as to account for the
facts.

(4) Elaboration of the original hypothesis and the collection of more facts by
observation, or by carefully controlled experimentation, leading usually
to modification or rejection of the working hypothesis.


If the essential of the scientific method is the collection and understanding of 
facts, it is easy to understand why the present is an age characterized by quanti- 
tative measurements and the development of the statistical method. There are 
many variables in describing modern social phenomena and the range of variation 
is often very wide. Comparison of one series of facts with another is often im- 
possible except through the aid of statistical analysis. One single fact may 
contradict another when standing alone. For example, the hypothesis may be
that infant health is related to the size of the family income and the standard of 
living. Nevertheless, an infant in a low-income family survives while another 
infant dies in a family with a high standard. It is only when enough cases are 
assembled to permit the computation of infant death-rates according to the aver- 
age size of family incomes, that the original hypothesis can be tested. 
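
The computation described in this example can be sketched in a few lines. The records, the income classes, and the function name are hypothetical; the sketch only shows how single contradictory cases are absorbed once rates are computed for whole groups.

```python
from collections import defaultdict


def infant_death_rates(records):
    """Infant deaths per 1,000 births for each income class.

    records: iterable of (income_class, died) pairs, one per birth.
    """
    births = defaultdict(int)
    deaths = defaultdict(int)
    for income_class, died in records:
        births[income_class] += 1
        if died:
            deaths[income_class] += 1
    return {c: 1000.0 * deaths[c] / births[c] for c in births}


# Hypothetical births: some low-income infants survive and one
# high-income infant dies, yet the group rates still differ.
records = (
    [("low", True)] * 3 + [("low", False)] * 17
    + [("high", True)] * 1 + [("high", False)] * 19
)
rates = infant_death_rates(records)
```

With only twenty births in each class the rates themselves would be unstable; the sketch assumes that in practice enough cases are assembled for the comparison to carry weight.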

The general reader on scientific method may find it desirable to pass over Lec- 
tures III and IV. The author has incorporated philosophical, historical and 
recent experimental material in this series in the hope of throwing “new light 
upon the immediate problem of research, education and eugenics.” 

ROBERT E. CHADDOCK
Columbia University


The Statistical Method in Economics and Political Science: a Treatise on the 
Quantitative and Institutional Approach to Social and Industrial Relations, by 
P. Sargant Florence. New York: Harcourt, Brace and Company. 1929. 
xxiv, 521 pp. 

The author in his preface states: “This . . . is addressed to all students of
economics and political science who find theories unsatisfactory without the test 
of fact; and to all students of statistics who find figures or mathematical technique 
barren without application to life. These students may be just general readers 
pursuing their interest for interest’s sake; they may be learners and apprentices 
in research wanting additional training in theory and technique; or they may al- 
ready be specialists or teachers who find the survey and solution of social ques- 
tions impossible without recourse to statistical exposition.” 

This is a large order and the author realizes that it is. He seems to fear that he 
may fall down in trying to do everything for everybody and finally accomplish 
nothing for anybody, and the reviewer is inclined to share his fears. Each reader 
must decide for himself whether he has been helped by the book. American 
social science seems to have pretty much passed through the phase where this 
type of effort is useful; Britain may still be in it. The volume is one in the long
series entitled The International Library of Psychology, Philosophy and Scientific
Method.

E. B. WILSON

Economic Principles of Consumption, by Paul H. Nystrom. New York: The
Ronald Press. 1929. xi, 586 pp.

The fact that this is the third book on consumption to appear in a little over a 
year is in itself worth noting. All three books—Waite, Hoyt, and Nystrom— 
were, moreover, the by-products of courses in consumption and, judging from 
organization and format, were intended to serve as texts in other courses. Con- 
sumption, it would seem, can no longer be called the most neglected field in 
economics. 

These books show, however, in other fashion than by their mere existence that 
consumption properly conceived is by no means the completely neglected field 
that some have characterized it. The authors assume that the new studies of the 
statistical laws of demand are studies of consumers’ behavior; they assume that 
the old “cost of living” studies are studies of consumption as affected by income, 
size and composition of family, or other variable. They show that studies of 
purchasing habits are studies of one aspect of consumption and that students of 
human motivation and of culture history are throwing light on the fundamental 
problems of consumers’ behavior. 

No one of these three books alone, however, gives a complete view of the field 
of consumption. No one of them adequately defines consumption, lays out the
field, discriminates between its economic and non-economic aspects, or even hews 
closely to the line laid down by the author in statement of plan and purpose. 
Each book suffers in logical organization to some degree from the fact that it rep- 
resents topics to be covered in a course rather than a clear cut and definite 
attempt to “set” the problems of consumption as such and show what belongs 
and what does not. 

Mr. Nystrom in his preface says, “This book deals with consumption—what 
people want and why, . . . It indicates what is consumed, . . . Together with 
. . . Economics of Fashion (it) presents a description and explanation of
consumption.” The reviewer would say, selecting from the above, it indicates what
is consumed; it presents a description of consumption. In other words, the 586 
pages are a mine of information in regard to facts and trends in expenditures for 
food, clothing, housing, home furnishing and operating goods, etc. Mr. Nystrom
brings together the best data available in regard to current wealth and income, 
size and composition of families, distribution of population, and presents an 
interesting description of the ten levels of living that he sees exemplified in the 
United States today with an estimate of the number of individuals and families 
living at each level. 

What the book does not do, in the opinion of the reviewer, is explain consumption,
or indicate why people want what they do. In other words, the author does
not in any thorough-going fashion either apply himself to Miss Hoyt’s fundamen- 
tal problem—how did these particular interests arise among this group of people? 
—or follow up Mr. Waite’s suggestion—that modern statistical techniques ap- 
plied with imagination and resourcefulness to the analysis of quantitative data 
would disclose significant laws of consumers’ behavior. 

In view of the factual and descriptive character of Mr. Nystrom’s discussion, 
the choice of title, Economic Principles of Consumption, seems especially inappropriate.
The one criticism that can be made of the book is that it contributes
little to the discovery of “what order, law or principle underlies or governs con- 
sumption” which Mr. Nystrom himself says is the first object of a scientific study 
of consumption. What it does give is a substantial body of factual information 
derived from a careful survey of all available sources. Penetrating comments 
well worth the attention of students in this field are not wanting, for example, the 
discussion of the classification of expenditures, hitherto very unintelligently 
handled. A classified bibliography of 12 pages and an index of 26 add to the use- 
fulness of the book. Above all it should be emphasized that it is only by such 
works as this of Mr. Nystrom, appraisals and new attacks, that the field of con- 
sumption can become not only cultivated but fruitful. 
Hazel Kyrk 


University of Chicago 


The Five Day Week in Manufacturing Industries, by the National Industrial 
Conference Board. New York. 1929. 69 pp. 

This is an examination of the scope and technique of, and the experience with, 
the five day week in domestic manufacturing industry as disclosed by informa- 
tion supplied by 270 establishments with a total of 218,219 employees in 1928. 

The technique by which the five day week has been adopted varies depending on 
many circumstances such as whether the manufacturing process is continuous or 
non-continuous, whether or not the total number of hours worked per week has 
changed and whether or not there are changes in the wage rates. This shorter 
work-week predominates in plants where the manufacturing process is non- 
continuous. In general the total number of hours worked per week has either 
remained the same or has been but slightly reduced. Concerning wages, the 
report states that “in round numbers employees in 75 per cent of the companies 
which furnished information on this subject (i.e., 150 out of 270) did not suffer loss 
of wage income from the change to the five day schedule.” What proportion of 
the workers is covered by this same statement is not disclosed. 

Experience with the five day week also differs, although the Board concludes 
that “objections are fewer and, as a rule, less fundamental in character than the 
advantages cited.” Employers appear to be much influenced in their views by 
whether the five day schedule was adopted with their approval or whether it was 
forced upon them. 

In considering the scope of the movement, this appears to be limited in terms of 
establishments and number of workers involved. If the garment trade, partic- 
ularly in New York, and the Ford Motor Company are excluded, the scope of 
the movement appears to be very slight as yet. About two-thirds of the estab- 
lishments which report regular operation on a five day basis have less than 100 
workers and only a small proportion of the employees reported are working under 
union agreements. About 70 per cent of the reporting establishments are in 
New York State alone. This, it is explained, “is undoubtedly due in part to the 
fact that this section is predominately the manufacturing section of the country 
and in part to the prevalence in this section of the garment and printing indus- 
tries.” 
In terms of social importance the growth and status of the five day week lies 
“more in the future possibilities than in the present importance of this work 
arrangement.” 

The usefulness of this study would be increased for some purposes if informa- 
tion were given which would make possible a judgment as to the representative- 
ness of the sample. It is desirable to know the basis on which firms were 
selected, and what the distribution was, including geographical, of the original 
inquiry and also of the replies. 

H. LaRue Frain 

University of Pennsylvania 


Exercise Manual in Statistics, by Karl John Holzinger and Blythe Clayton 
Mitchell. Boston: Ginn and Company. 1929. 160 pp. 


Though the title does not so indicate, the authors have prepared this book of 
exercises for the use of students of education. They state also that “Adequate 
drill and practice is rarely furnished by the current textbooks because the prob- 
lems are too few and they are not graduated in difficulty. The present manual 
should, therefore, be of service not only to the classroom teacher but to many 
students requiring special practice.” The problems are so arranged that they 
may be used not only with Professor Holzinger’s text but with any of the present 
textbooks on educational statistics. Chapters deal with tabulation and classifi- 
cation, graphs, averages, dispersion and skewness, the point binomial and the 
normal curve, sampling, correlation, and curve fitting. Correlation is treated in 
four chapters. One is devoted to linear correlation; a second treats of the corre- 
lation ratio, but ventures no further into the field of non-linear correlation; a 
third deals with rank correlation, biserial r, and the coefficient of contingency; a 
fourth presents partial and multiple correlation. 

In addition to the exercises proper, a brief amount of explanatory matter and a 
list of references accompany each section. At the close of the book the answers 
for the 486 exercises are given. 

The exercise book contains no tables. This is doubtless because the authors 
feel that the need for tables is filled by Professor Holzinger’s Statistical Tables for 
Students in Education and Psychology. 

FREDERICK E. Croxton 

Columbia University 


Handbook of Financial Mathematics, by Justin H. Moore. New York: Prentice 
Hall, Inc. 1929. xvi, 1216 pp. 


It is customary to publish handbooks in the various fields of science for the 
purpose of assembling systematically in one unit fundamental principles relating 
to a particular domain. The guiding thought in the case of the Handbook of 
Financial Mathematics has been to present a complete review of the mathemati- 
cal theory applied to the specified subject. The volume, which is a veritable 
encyclopedia, covers forty-four chapters, eleven tables, a bibliography, answers to 
problems, a list of symbols and an index, these chapters making up the impressive 
total of 1,216 pages. Among the many noteworthy features is the methodical 
exposition of the subject. Examples shown are clearly set off in the text, and all 
data necessary for the solution of a particular problem are “framed” by a line 
box, so that the reader sees at a glance the respective essentials. Intermediate steps 
of each calculation are also given as far as space permits, so that the consultant 
can follow gradually the mathematical developments, a feature frequently 
neglected in mathematical presentation. 

Very helpful for the understanding of the mathematical theory are twenty- 
nine diagrams distributed throughout the book. Furthermore, the volume con- 
tains detailed information on algebraic technique, logarithms and probability, 
the latter in reference to life insurance. Numerous tables are inserted in the 
text besides 68 pages of tables in the appendix. The book has been written for 
the “shop” with the particular aim in mind to meet any problem which might 
arise on short notice, without being compelled to go through a mass of theoretical 
material. However, there is abundant reference to theoretical aspects which is 
segregated in such a way as not to interfere with those parts designed for rapid 
consultation. The publisher has left nothing undone to give the handbook an 
attractive appearance. 
R. von Huhn 


An Examination of Earnings in Certain Standard Machine-Tool Operations in 
Philadelphia, by H. LaRue Frain. Philadelphia: University of Pennsylvania 
Press. 1929. xiv, 81 pp. 

In this short report the author examines certain variable factors which affect 
earnings. He has confined his study to seven skilled machine tool operations. 
Hourly earnings were collected for 1,456 employees in 43 shops in April, 1927, and 
hourly, weekly and yearly earnings in 1925 and 1926. Large and small factories 
were covered; the operations were standard and fairly typical of machine opera- 
tions in general. The men were unorganized and could move from job to job at 
will; and the sample was large and varied enough to risk generalizations. The 
material revealed no such phenomenon as a market price for labor. Hourly earn- 
ings showed a range of $1.18 and weekly earnings of $11.50, the relative standard 
deviation of the former being 27.3 and the latter 21.2 per cent. The spread of 
hourly earnings was not uniform among the occupations either, varying from 68 
cents for drill press and planer operating to $1.18 for milling machine operating. 
Whether or not a larger difference than this would appear if the wages were pro- 
portionate to efficiency the author does not feel he has adequate material to 
reveal. 

Because of the paucity of material on methods of wage payment, the results of 
an analysis of payment by time, piece or bonus are of much interest. Hourly 
earnings are lowest when wages are paid to time workers, and are 10 and 20 per 
cent higher to piece and bonus workers. 

The material reveals a lack of uniformity in the normal or full time week, the 
absence of any benefit in earning power in a period of service longer than ten 
years, little correspondence between actual and normal time, and the inadequacy 
of the mean or median as descriptive of the material analyzed. The study is 
carefully done, cautious in its conclusions, and critical in its attack. It is from 
more small studies of this nature that we can build a theory of wages which will 
have close relation to actual facts. 

Guapye .2. FRrEDMAN 


New York City 





